The second data structure is the AS graph used to run the BSF algorithm on. It is implemented in the form of association lists. The implementation consists of two tables. The first table contains the list of direct inter-AS links. The second table is indexed by ASNs. For each ASN it indicates the offset in the first table at which the AS-specific interval starts. All ASes that have no links point at offset 0.
Both data structures are created based on a BGP data file, downloaded, for example, from the RouteViews web site. Since the file reflects the situation in the Internet, it should be periodically updated by an external program, so that the redirector always operates on a recent version. The AS-path length policy re-reads the file periodically (once a day, by default), and builds new versions of the lookup tree and the AS graph.
Interpreting the BGP data file contents can take several minutes, as the file is over 400MB large (decompressed). Thus, the file cannot be parsed during the server configuration phase, as doing so would increase the server startup time. A similar problem occurs when reloading the file, as the AS-related data structures that are being updated cannot be used to answer client queries. In our solution we use two separate BGP data sets, each consisting of one lookup tree and one map of the Internet, and switch between them as follows. During initialization, the AS-path length policy allocates two shared memory chunks for the two separate BGP data sets, and creates a special switching structure. As long as none of the BGP data sets is available, the redirector uses the static name-to-address mapping instead of calling the AS-path length policy. The actual loading of the BGP data file takes place after processing the first DNS request. Basically, the thread that was handling this request does not immediately return to accept a new one, but reads the file and prepares a new BGP data set, which is then placed in one of the shared memory chunks. As soon as it has finished, the data are made available to other threads by indicating the chunk as current in the switching structure (see Figure 4.8).
An analogous scheme is followed when reloading the BGP data file. One of the request-processing threads notices the necessity of updating the BGP data by checking the timestamp that can be found in the switching structure. Then, it re-reads the BGP data file and builds a new BGP data set, which is then placed in the non-current shared memory chunk. In this way, while the file is being parsed, the remaining request-processing threads can still use the current chunk. Once the new BGP data set is ready, the timestamp and the current chunk indicator are updated accordingly. For this reason, some version of the BGP data structures is constantly available, and the AS-path length policy can be used all the time.
The AS-related data structures are used in the main policy function that performs the replica selection. It first checks (in the switching structure) which BGP data set it marked as current, and then uses it to run the actual algorithm, described above. The function does not modify the shared BGP data. Therefore, no locks are needed here, and many instances of the function calls can be run simultaneously.