A Secure Instant Messaging System

Malgorzata Wrzesinska


Date: June 2002


Key words and phrases: instant messaging, security, scalability, Globe, SSL

Abstract:

In recent years, instant messaging systems have gained more and more popularity. Most of the existing systems have shown to have severe security problems with respect to user privacy, message authenticity and eavesdropping. Secure instant messaging systems are rare and, when they exist, suffer from scalability problems. This thesis presents the design and implementation of an instant messaging system that is both secure and scalable. It concentrates on defining the core functionalities of an instant messaging system. Special care is taken to avoid overloading the system with the bells and whistles of commercial applications. Details are given on the sets of security and scalability requirements the instant messaging system should comply with. This paper shows that it is possible to design and implement an instant messaging system architecture that copes with the required functionalities and satisfies the defined security and scalability requirements. A proof-of-concept implementation of this instant messaging system provides some more insight on how selected functions of the instant messaging system can be combined and implemented using Java and Netscape's Secure Sockets Layer.


Contents

Introduction

In recent years, instant messaging systems have gained more and more popularity as a new means of communication over the Internet. Instant messengers allow their users to exchange text messages but, unlike email, the sender and the recipient of a message are online at the same time. In this respect communicating via an instant messaging system is more similar to using telephone than mail.

Security is increasingly becoming an important issue. People want to retain their privacy. Communications should not be overheard, copied, blocked or modified by a third party. However, the Internet is known to be weak and vulnerable with respect to privacy. Therefore, there is a considerable effort being made to incorporate security into the existing communication systems and to create new secure communication tools.

Another important issue is scalability. The scalability of a system is its ability to handle large numbers of users distributed over geographically large areas without notably affecting the overall performance of the system. With the growing popularity of the Internet and the increasing number of users, systems that have not been designed to be scalable currently show some performance problems. For example, some very popular Web services, such as the Polish site of a very popular TV program ``Big Brother'', simply cannot handle all the requests of the people willing to access those pages.

There exist many instant messaging systems. The most popular ones, such as ICQ or MSN Messenger, can handle vast numbers of users and are reasonably scalable. However, they are reported to have major security flaws. Some other instant messengers, such as Iris, implemented most often as research projects claim to be secure. On the other hand, those systems suffer from scalability problems.

The main goal of my project was to create an instant messaging system which would be both secure and scalable. Additionally, I wanted this system to have a reasonable set of functionalities. In other words, we have to provide everything that is necessary to make it a convenient tool without overloading it with all the bells and whistles that can be found in commercial applications.

The project was done in the context of Globe, a distributed system being developed at the Vrije Universiteit, whose main concern is scalability. My instant messenger uses one of the Globe services, namely the Location Service, to locate its users in the Internet.

The rest of this report is organized as follows: Chapter 2 describes the functionalities of an instant messaging system. Chapter 3 defines the security and scalability requirements for an instant messenger. Chapter 4 introduces briefly the Location Service. In Chapter 5, I present an architecture of an instant messaging system which meets the requirements from Chapters 2 and 3. In Chapter 6, I describe the implementation of my instant messenger. Chapter 7 gives some views on future work and I draw my conclusions in Chapter 8.


Overview of instant messaging system functionalities

Before creating any application, it is necessary to define first what kind of functionalities it should provide. In this chapter, I define a set of functional requirements for an instant messaging system. I based my work on RFC2778 [12] which defines a minimal model for instant messaging. I also took a look at existing instant messengers and selected some of their features to extend this minimal model. I took care of choosing only features that are useful for instant messaging. I avoided all the bells and whistles with which the commercial systems are overloaded for competitive reasons. Among the systems I investigated were the popular Mirabilis ICQ [3] [11] [2] and MSN Messenger [7] [14], Jabber [16] [15], a very promising, interoperable system being developed by Jabber Inc., Gadu-gadu [1], a Polish instant messenger which I use daily and finally Iris [9], developed as a research project at Rice University. From the review of the related work, I concluded that an instant messaging system is a combination of two services. One of them is the instant messaging service, which is a means of sending small, simple messages that are delivered immediately to online users. The other one, the presence service, allows its users to learn who is actually online and to whom an instant message can be sent. In other words, the presence service is used to retrieve information about status of other users. Hereafter, I detail the two services of an instant messaging system.

Presence service

The presence service allows the users to know who else is present in the system and therefore who can receive instant messages.

Status

A user's status determines whether the user can be contacted or not. A user can be contacted when his status is ``online,'' which means that the user is logged in and possibly using the system. When the user's status is ``offline,'' the user is not logged into the system and cannot be contacted. Some existing instant messengers extend this basic set of statuses. Tables 2.1 and 2.2 summarize statuses available in MSN Messenger, ICQ and Gadu-gadu. Jabber and Iris do not offer this functionality.


Table 2.1: Statuses and their meaning in MSN Messenger, ICQ and Gadu-gadu
  MSN Messenger ICQ Gadu-gadu
online available for contact; a user is notified of incoming events by an alert sound and a flashing icon available for contact; a user is notified of incoming events by an alert sound and a flashing icon available for contact; a user is notified of incoming events by an alert sound and a flashing icon
free for chat - available for contact; a user is notified of incoming events by an alert sound and a flashing icon; incoming chat requests are automatically accepted -
away(MSN, ICQ)/I'll be right back(Gadu-gadu) a user is online but away from his computer; a user is notified of incoming events by an alert sound and a flashing icon; a user's status is automatically set to ``away'' when the user does not perform any action for a certain period or when his screen saver is activated; a user can also set the ``away'' status himself a user is online but away from his computer; a user is notified of incoming events by a flashing icon; a user's status is automatically set to ``away'' when the user does not perform any action for a certain period or when his screen saver is activated; a user can also set the ``away'' status himself a user is online but away from his computer; a user is notified of incoming events by an alert sound and a flashing icon; a user's status is automatically set to ``I'll be right back'' when the user does not perform any action for a certain period; a user can also set the ``I'll be right back'' status himself
extended away - a user is online but away from his computer for an extended period of time; when a message arrives, the client application behaves the same as when the user's status is ``away'' -
at dinner same as ``away'' but this status is not set automatically - -



Table 2.2: Statuses and their meaning in MSN Messenger, ICQ and Gadu-gadu - ctnd.
  MSN Messenger ICQ Gadu-gadu
at the phone same as ``at dinner'' but the incoming message is not signalized by an alert sound - -
busy same as ``at the phone'' - -
occupied - only messages marked by the sender as urgent are delivered; an incoming message is signalized by an alert sound and a flashing icon -
do not disturb - incoming messages are not signalized by a sound; the sender of a message receives a message that the user does not want to be disturbed -
offline a user is not available for contact a user is not available for contact a user is not available for contact


A change in a user's status occurs in various ways:

A taxonomy of users

RFC2778 distinguishes two types of users of the presence service: publishers who provide presence information and watchers who retrieve it. Watchers can be further divided into two categories: fetchers and subscribers. A fetcher fetches the presence information of a given publisher, that is, retrieves its current value. A subscriber subscribes to some publisher's presence information which means that he wants to receive notification when the publisher's status changes. When he wants to stop receiving notifications, he must unsubscribe. A publisher can cancel a subscription of a certain watcher which also results in stopping to send notifications to this watcher. A typical user of an instant messaging system is both a fetcher and a subscriber. When he logs in the system, he first fetches the presence information of all the publishers he subscribed to. Then, notifications are automatically sent to him as the status of one of the persons he subscribes to changes.

Privacy preferences

Being a user of an instant messaging system should not mean exposing one's presence information to anyone who might be interested in obtaining it. Therefore, instant messaging systems allow their users to define their privacy preferences, through which they can control who is allowed to observe their status. This includes: A users may also want to decide what kind of presence information about him is made available to other people regardless of what his status really is. The user can of course choose to expose his actual status but he may also want, for example, to appear offline while being online (the so-called ``invisible mode''). It is also possible for a user to have different settings for different users and groups of users. Existing instant messengers offer a way to set privacy preferences to a certain extent. ICQ, MSN and Iris introduce allow and disallow lists - lists of users that are allowed or not allowed to subscribe to the given user. Gadu-gadu offers a possibility of disclosing status only to the people on the user's contact list, that is, a list of people a user subscribes to. These three systems have the invisible mode. None of the systems I studied gives its users a possibility of neither canceling subscriptions, nor viewing their lists of subscribers.

Instant messaging service

An instant messaging service functions as a means of communication between online users. The instant messaging service makes use of the presence service to determine which users are online and which can be therefore contacted.

Types of communication

Existing instant messaging systems distinguish three types of instant communication, namely instant message, chat session and chatroom communications.

Instant message

An instant message is a small unit of data exchanged by the users of an instant messaging system. An instant message consists mainly of plain text. It can also contain other elements. For example, a system can support URLs which can be recognized by the client-side software and marked out in the text displayed to the user. Certain types of URLs can be bound with external applications, such as a browser, an image viewer, a PDF viewer. A user can launch the application by clicking on the URL. A multipart mime-typed message is another, further extension of a plain text message. Existing instant messengers usually offer their users a possibility of formatting their texts using various fonts and text styles (ICQ) or of including small images expressing emotions in their messages (ICQ, MSN, Gadu-gadu).

Chat session

Instant messaging systems also support a concept of chat session or, to use more human-friendly terminology, a conversation. In a chat session, a sequence of instant messages exchanged by two users can be recognized and presented to them as a coherent whole. The client-side software of an instant messenger presents all the messages of the session as a whole by, for example, displaying them in one window. To hold such a conversation, a user creates a chat session and then invites another user to join it. The other user can choose between joining the session and rejecting the invitation. If he joins, the session is established and from then on, the users can send and receive messages within that session. When a user wants to terminate a conversation, he leaves the session. A user can want to talk to more than one person at a time. To do so, a user could start multiple concurrent chat sessions, but one can hardly call it a group conversation: one's interlocutors cannot ``hear'' each other. We would rather want to have a group conversation in which each party can ``hear'' everything the other parties say. The concept of ad hoc group is the natural extension of a two-party chat session to a multi-party chat session. A user can invite more than one person to a chat session. An existing chat session can also be extended by letting the current members of the session invite new people. Ad hoc groups have two important features: Out of existing instant messaging systems only MSN Messenger offers ad hoc groups.

Chatroom

A chatroom can be seen as the persistent version of a multi-party chat session. A chatroom is a virtual ``place'' where people can meet and talk. A chatroom itself is not a group but rather a meeting point that the user to easily initiate a conversation. A chatroom is public, that is to say, directly accessible from the outside without further need of being invited by a current group member. Being persistent, a chatroom exists regardless of the number of people that are currently talking. As such, it has to be explicitly created and destroyed. When a user wants to take part in a chatroom conversation, he enters the chatroom. From then on, every message sent within this chatroom is delivered to this user. Messages sent by this user within this chatroom are delivered to all other users of the chatroom. A user may also want to send a message only to a specific user (or subset of users) of the chatroom without it being heard by other members. This operation is called whispering. Finally, when the user wants to end the conversation, he leaves the chatroom. Chatrooms offer some additional possibilities that do not exist in ad hoc groups. On entering a chatroom, a user can choose a nickname which allows him to conceal his true identity. This feature lets the users preserve anonymity. Chatrooms with nicknames are implemented in Jabber. ICQ also offers chatrooms accessible via a WWW browser. Other systems that I investigated do not offer chatrooms at all.

User preferences

To prevent users from being flooded by unwanted messages, instant messengers allow them to define their privacy preferences. A user is able to control who can send him messages and what kind of messages he wants to receive. Users may generally: Blocking messages from specific users is possible in most instant messengers. ICQ, MSN Messenger and Gadu-gadu allow a user to specify an ignore list - a list of people whose messages will be rejected. In Iris, this list is called a blackball list. Only MSN and Iris allow a user to accept messages only from specific users. None of the instant messengers I studied makes a distinction between privacy preferences for instant messaging and privacy preferences for presence notification. Therefore, if someone is on a user's ignore list, he can neither send messages to the user nor subscribe to his presence information. None of the investigated systems support blocking specific types of messages.

Delivering messages to offline users

A very important feature of instant messaging is that the communication is transient and not persistent. With persistent communication, a message is stored by the communication system until the moment it can be delivered. In contrast, with transient communication, a message is discarded if the receiver is not available. In context of instant messaging, transient communication means that messages are actually delivered only to online users. A message sent to offline users is not delivered. However, we saw that some of the existing instant messaging systems (for example Gadu-gadu) provide persistent communication by means of storing messages intended for users that are currently offline. Such a message is delivered as soon as its receiver becomes online again. Nevertheless, as instant messaging systems are basically meant to be a means of transient communication, such a feature does not seem to be particularly useful, while storing messages is a significant burden for servers. To achieve persistent communication, the existing email system can be used. When the receiver of a message is unavailable, the message can be sent as a regular email.

Additional services

The presence and instant messaging services form the core functionalities of an instant messaging system. However, to make such a system more attractive and easy to use, some additional services are also needed.

Global User Directory

To contact somebody, a user of instant messaging system must have that person's address or identifier. To assist in finding and passing addresses (identifiers), instant messaging systems provide a global user directory. The global user directory is a database containing user addresses possibly with other data, such as name, country, age, etc. The additional data can be used as criteria for searching the database. However, publishing data in such a database makes a user a potential victim of spamming with advertisements. Companies willing to advertise their products can make use of such databases to select a group of potential customers. Therefore, it should be completely up to the user whether his personal information is published in the global database. As the only supplier of that information, the user is also completely free to decide what kind of data concerning him should be inserted into the database and what should be left blank.

ICQ and Gadu-gadu offer such databases. They give their users complete freedom as to what data they actually insert into the database.

Address book

Address book is a database of users stored in the user's client-side software. As in the global user directory, the address book contains addresses of users and their personal data. A user adds entries to his address book either manually or by transferring data from the global user directory. All existing instant messaging systems offer this feature.


Security and scalability requirements

My goal within this project is to create an instant messaging system that is both secure and scalable. This chapter gives an overview of what the security and scalability problems are within the existing instant messaging systems. In this chapter, I also define security and scalability requirements for an instant messaging system. Finally, I give some considerations on the problem of combining security and scalability .

Security

Security in a distributed system has two aspects: secure channels and access control [18] [17]. A secure channel is a mechanism of protecting the communications between users or components of the system. A secure channel ensures confidentiality and integrity of the messages as well as mutual authentication of the users. Confidentiality means that the information is not disclosed to unauthorized parties. Integrity means that the information can be modified only in authorized ways. Finally, authentication refers to the fact that the claimed identity of the communicating parties can be verified. The other aspect of security is access control. Access control means that access to the elements of the system is explicitly verified and that access is actually granted only to authorized parties. Both secure channels and access control are necessary in an instant messaging system. Information has to be distributed via secure channels because users do not want their instant messages, presence notifications, subscriptions, etc. to be eavesdropped. They want to be sure that the message they receive is the same that has been sent and that it has been sent by the one they believe sent it. Access control is necessary because people want to be sure that the information they feed into the system, namely presence information, preferences and subscription lists, can be only accessed and modified by authorized persons.

Security problems in existing systems

Most of the existing instant messaging systems, including the most popular ones, such as ICQ and MSN Messenger, cannot be considered secure. They were often designed and implemented in times when nobody was thinking about security. Now, as security became more important, some efforts are being made to incorporate security features into those systems. Nevertheless, most of the systems I investigated (excluding Iris) turned out to have severe security flaws which I discuss below.

Instant messaging

Status

Most of the existing instant messaging systems do not guarantee that presence information is accessed only by authorized persons. Some of them do not even give their users the possibility of deciding who can see their status. They simply disclose status information to anybody that happens to be interested. This holds for older versions of Gadu-gadu; in newer versions, users can decide to disclose their status only to people on their contact list. Other systems allow their users to define their preferences for presence notifications but weak authentication and encryption mechanisms still allow unauthorized persons to read the status. MSN Messenger and Gadu-gadu send messages in plain-text. ICQ encrypts only client-to-server messages but using a very weak algorithm [10]. By listening on the wire an attacker can pick up a message and learn the status of a user from it.

In the following, I describe a possible attack on ICQ and emphasize the security flaws of this instant messaging system. Due to its weak authentication mechanisms, ICQ offers an easy way of spoofing messages. Messages are supposedly protected against spoofing by including in their headers a SESSION_ID, that is a random number chosen at login time. All messages from an ICQ client to an ICQ server must contain this number, otherwise the server ignores the message. ICQ message headers are encrypted. However, the encryption algorithm is proprietary and has shown to be very weak. Code for encrypting and decrypting ICQ headers can be found on the Internet [10]. Decrypting headers is thus not a problem for an attacker. The attacker can easily intercept a message sent between a client and a server, decrypt the header using one of the above mentioned programs and find the SESSION_ID, which can be used to spoof a message and send it to the server. By spoofing a message updating user preferences, the attacker can add himself to the victim's list of people allowed to read his presence information and then read the victim's status.

Subscriptions

In most existing systems, it is possible to subscribe or unsubscribe on behalf of another user. Messages are not protected or too weakly protected against spoofing (like in the case of ICQ). Anybody can therefore fabricate a subscribe or unsubscribe message. Messages are also not protected against modification, so the attacker can change a legally sent message to make its sender subscribe to somebody else than he wanted.

Instant messaging

Confidentiality of messages

MSN Messenger and Gadu-gadu send messages in plain text. ICQ uses a securely weak proprietary algorithm as described above. Therefore, anybody can intercept, that is, read a content of a message sent between two users.

Authenticity of messages

In ICQ, to forge a message, an attacker only has to send a packet encrypted with the ICQ algorithm to the victim. Since the algorithm is public, anybody can do it. The users of ICQ have absolutely no guarantee that the messages they receive are really authentic.

Flooding

Existing instant messengers provide their users with mechanisms for filtering messages. A user can define who can send him messages. However, an attacker can forge messages so that they look as if they were sent by somebody who is allowed to send messages to the victim. An attacker can then flood the victim with such messages.

Overall security problems

Most of the existing instant messengers are not secure for two reasons. First, they do not have proper security policies. A security policy defines among other things, which actions the entities of the system may take and those that are prohibited. Allowing everybody to access the presence information of other users is an example of a bad security policy. Second, the existing instant messengers lack mechanisms to enforce their security policies. Weak authentication prohibits proper access control. Information is distributed via insecure channels, that is without encryption, authentication and integrity checks. In the rest of this chapter, I will describe a security policy (a set of security requirements) for our system. This work is based on RFC2779 [13] which gives the security requirements for the model of instant messaging defined by RFC2778 [12]. Since my model is an extension of that of RFC2778, I also extended the set of requirements. The mechanisms I use to enforce my policy are described in Chapter 5.

Security requirements for presence notification

The presence service is a means of distributing presence information about the users of an instant messaging system. The presence service should guarantee its users a control over the distribution of their presence information, that is, what kind of information is actually distributed and who can read it.

I describe below the security requirements necessary for subscriptions to presence information and presence notifications.

Subscriptions

Notifications

Security requirements for instant messaging

The instant messaging service is a means of holding conversations over the Internet. A user of this service expects his conversations to be kept confidential, that is, nobody is able to overhear them. A user also wants to be sure that the messages he receives are the same that the ones that were sent to him. Finally, he wants to be sure that each message was truly sent by the party he believes sent it.

I describe below the security requirements the instant messaging service has to comply with, with respect of instant messages, chat sessions and chatrooms.

Messages

Chat sessions

Chatrooms

Scalability

As the Internet grows and the popularity of instant messaging increases, more pressure is put on the instant messaging systems to handle an increasing number of users from all over the world. Yet, the instant messaging systems are still expected to show an acceptable performance. Therefore, scalability is becoming an important design goal for instant messaging systems. Scalability has several aspects. A system can be scalable with respect to the number of users it can handle. One can always add a new user to the system without making worse its performance. A system can be also geographically scalable. It means that if its users or components lie far apart, this does not affect the overall performance of the system. Both aspects of scalability are important for instant messaging systems. Let us take a look at the scalability of existing instant messengers.

Scalability of existing systems

ICQ has a fairly scalable architecture. Chat sessions are implemented in a peer-to-peer fashion. To hold a conversation, two clients set up a direct point-to-point connection. Presence notifications are distributed via servers. Servers are replicated but they are all located on the same LAN. This architecture makes ICQ scalable with respect to the number of users. If there is too many users for the existing servers, a new server can be added. Geographical scalability is preserved only in the case of chat sessions (point-to-point connections). Presence notifications are sent through servers which can lie very far apart from some users.This may cause communication latencies. Since presence notifications are asynchronous messages, users do not notice that they receive notifications with a delay, unless they try to send a message to a user that has disconnected and whose notification has not yet arrived.

Gadu-gadu is implemented as a central server that maintains user accounts and distributes presence notifications. Such an architecture is inherently inscalable. A single server is always limited with respect to the number of users it can handle. Chat sessions are, however, implemented by means of point-to-point connections. Geographical scalability is therefore preserved in the case of chat sessions.

In Jabber, all the communications between the clients, including chat sessions, go through servers. However, since the servers are replicated and can be distributed over large areas, the scalability, both geographical and with respect to number of users is preserved.

Iris is the most centralized system among those I investigated. It uses a central server through which all the communications go. It does not use point-to-point connections. Therefore, it exhibits severe problems with respect to scalability. The number of users of such a system is limited by the capacity of the server. As for geographical scalability, if two users lie far apart from the server they will perceive the performance of the system as very bad due to the high wide-area network communication latencies. An interesting fact is that none of the existing systems turned out to be both secure and scalable. The question of combining security and scalability is dealt with in the following section.

Combining security and scalability

Combining security and scalability turns out to be a difficult task. One of the most important techniques of achieving scalability is distribution of the system components and algorithms carried out by the system. Centralized systems are always limited with respect to the number of users they can handle. A single server cannot cope with a very large number of users. Geographical scalability is also hindered by centralization. If the users of the system are distributed over a large area, some of them will lie very far apart from the central server. From those users' point of view, the system will show a poor performance, due to the important wide-area networks communication latencies. To cope with users distributed across large areas, the system must be also distributed over large areas. To cope with a large number of users, the system must consist of a large number of components, such as servers and clients. Unfortunately, a distributed architecture is a threat for security. The more components the system consists of, the more security ``holes'' it may have. A large number of components in the system enforces a large amount of communication, which may be subject to various security attacks. Distributed algorithms are also very vulnerable for security threats. They require that all components of the system that take part in executing the algorithm are trusted. If one of the components is ``taken over'' by the enemy, the security of the system as a whole is endangered. Thus centralization is the best way to achieve security while distribution is the best way to achieve scalability. In Chapter 5, I show how to reach a satisfying tradeoff for combining both security and scalability.


The Globe Location Service

My instant messaging system has been done in the context of Globe, a distributed object-oriented system developed at the Vrije Universiteit. Our instant messenger makes use of one of the Globe services, namely the Location Service. This section describes briefly the Location Service. More information about the Location Service can be found in [19].

The Internet allows people to easily share information. However, sharing information also means finding it. Objects on the Internet become more and more mobile. For example, there is an increasing number of mobile computers and telephones connected to the network. Similarly, software objects can also change their location. For example, web pages are moved from one server to another. Mobile agents travel across the Internet to gather information. Locating an object is usually supported by a so-called naming service. A naming service maps an object name to its address or addresses in case of a replicated object. The Domain Name System is an example of such a service. It maps a user friendly machine name to its IP address. DNS was designed to support stable name-to-address mappings. However, moving an object is likely to change its name-to-address mapping, making DNS not suitable for supporting mobile objects. The Globe Location Service was designed to handle frequently changing addresses of mobile objects.

Two-level-names

Figure 4.1: A two-level naming hierarchy
\includegraphics[width=\textwidth]{two-level-names.ps}

The Globe system introduces a two-level naming scheme in which a user-friendly object name is first translated into a so-called object handle which is subsequently resolved into the object address, namely a contact address. The first part of the resolution process is handled by a naming service, the second by the location service, as shown in Figure 4.1. The location service supports frequent updates of object addresses, as it is the case with mobile objects. An object handle is a globally unique identifier of an object. It is also location independent, which means that it does not change when the object moves. It was designed specifically for the location service and is by no means a user-friendly name like an URL. A contact address is a location-dependent identifier and is composed of the IP address of the machine the object is located at, the port number, the name of the protocol that has to be used to contact the object (e.g., TCP or UDP) and some additional information needed to contact the object.

The organization of the Location Service

Figure 4.2: The organization of the Location Service as a search tree
\includegraphics[width=\textwidth]{region-tree.ps}

The Location Service is organized as a worldwide search tree. The whole network is (virtually) divided into a hierarchy of regions, as shown in Figure 4.2. Each region is assigned a directory node. For example, the directory node PA covers the region PA which is composed of the regions A0, A1 and A2. A directory node is capable of storing addresses that lie in its region. Addresses of objects are normally stored in the leaf nodes. For each address, the Location Service constructs a path of forwarding pointers from the root to the leaf node where the address is stored. A forwarding pointer points to a child node where either the contact address can be found or another forwarding pointer. Addresses and forwarding pointers are stored in contact records.

Basic operations

The Location Service supports three kinds of operations for accessing or modifying contact records, namely: insert, delete and lookup.

Figure 4.3: Inserting an address when object is already known. Only the missing pointers are established
\includegraphics[width=\textwidth]{insert-general.ps}

Inserting a contact address

To insert a new object handle an insert request is issued at one of the leaf nodes, shown as step (1) in Figure 4.3. The request is propagated up the tree to the root. Then a path of forwarding pointers is created from the root to the leaf node where the insert was initiated. A contact record containing a forwarding pointer is thereby stored at each intermediate node. The address is stored at the leaf node. Objects can be replicated, that is can have multiple copies. A replicated object has a single object handle but multiple contact addresses (one for each replica). When inserting a contact address for a replica, a part of the path of forwarding pointers may already exist. When the insert request is propagated up the tree it eventually arrives at a node with a contact record for the inserted object (step (2) in Figure 4.3). The path of forwarding pointers is establish ed from this node down the tree (step (3) in Figure 4.3).

Deleting a contact address

Deleting a contact address occurs simply by removing it from the contact record where it was stored. If the removal leaves the contact record empty, the contact record is also removed. The parent node is subsequently requested to remove the forwarding pointer to the removed record, which may lead to recursively deleting the forwarding pointer in the parent directory node.

Looking up an object

Figure 4.4: Looking up a contact address
\includegraphics[width=\textwidth]{lookup.ps}

Looking up an object is initiated at a leaf node (step (1) in Figure 4.4). The request is propagated up the tree (step (2)) until a node with a forwarding pointer for the given object handle is found (step (3)). From this point the search continues down the tree (step (4)) along the path of forwarding pointers until the contact address is reached (step (5)).

The lookup algorithm guarantees locality of the search. When the object is located in a nearby region, only the local part of the tree is involved in the search. The search is gradually extended to cover a broader part of the network. The lookup algorithm also guarantees that if an object has more than one contact address, the nearest one is always returned first.


Design

The goal of this project is to create a system that fulfills the functional requirements from Chapter 2 and reaches the security an scalability design goals from Chapter 3. This chapter describes the architecture of the system that I designed. I also discuss some alternative solutions and motivate my design choices.

Global design

Figure 5.1: The organization of the system components
\includegraphics{global-design.eps}

The model of my instant messaging system consists the following components: the servers, the clients, the Location Service and the Global User Directory (GUD) server. The components are organized as shown in Figure 5.1.

My instant messaging system uses multiple servers. Each client maintains a connection with one of the servers. Using the connection with a server, a client sends and receives presence notifications, participates in a chatroom or sends subscription messages. Servers communicate with each other to distribute presence notifications, implement chatrooms and exchange user preferences and IP addresses. To implement chat sessions, clients set up direct point-to-point connections with each other.

Each server maintains a connection with the Location Service. A server takes care of inserting, removing and looking up contact addresses in the Location Service.

The Global User Directory server is a dedicated server maintaining a database of the users of my instant messaging system. A client directly sets up a connection with the Global User Directory server to access the database.

Servers

In this section, I discuss a number of design choices concerning servers. When designing a distributed system, we have to choose between a peer-to-peer architecture (without servers) and a client-server architecture (with clients and servers). When using servers, we further have to choose between a centralized (one server) and distributed (many servers) architecture. For my instant messaging system, I have chosen the distributed client-server architecture, therefore with multiple servers. In this section, I discuss the advantages and disadvantages of all the above mentioned options and motivate my choices.

The role of servers in my instant messaging system

In my instant messaging system, servers achieve different tasks including finding users, managing their preferences, distributing presence notifications and implementing chatrooms. Hereafter, I describe these different tasks.

Each server maintains a connection with one of the leaf nodes of the Location Service. After the connection has been set up, the server inserts its contact address together with the serverID. The serverID is a special object handle shared by all the servers. The contact address consists of the IP address of the machine the server is running on and the port number the server is listening for incoming connections from other servers. This allows the other servers to locate and connect to this server. The justification for sharing the serverID, is that whenever a server looks up another server, it is interested in locating any server and not a particular one.

Servers collaborate to distribute users' presence notifications. I describe the presence service in detail in Section 5.12. Servers also implement chatrooms in a distributed fashion. I give the details on chatrooms in Section 5.11.

Servers also manage user accounts. Each server stores data such as: user preferences and subscription lists.

For each of its clients, a server inserts its contact address together with the userID into the Location Service. A userID is a unique identifier of a user in the system. Note that the client's address is not inserted into the Location Service. Thus, the server remains the only component of the system that knows the IP address of the client. To contact the client one must first contact the server this client is logged into.

A server listens to two ports numbers. On one of the ports, the server waits for client connections without client-side authentication. Those connections are used to create a user account. A user that does not have an account cannot authenticate himself. After creating an account the connection is immediately broken. To log into the system and perform any action a user must connect to the second port with authentication of the both sides. On the second port the server also listens for connections from other servers.

Client-server versus peer-to-peer model

It was not obvious whether to use a client-server model at all. A peer-to-peer architecture, that is, a system consisting only of clients locating each other by means of the Location Service, was an alternative. A peer-to-peer architecture has many advantages, especially with respect to scalability. The total number of clients is not limited by the capacity of the server. The performance of the system is not hindered by large distances between clients and servers. Exploitation of locality is inherent in peer-to-peer systems: the closer the clients lie to each other, the faster they can communicate. These goals are often harder to achieve in client-server architectures.

On the other hand, such a high degree of distribution introduces many problems with respect to security, as it was already mentioned in Chapter 3. Since the client-side software is distributed freely, nobody can guarantee that a particular client really does what it is expected to do. Therefore, since it is impossible to protect client-side software against tampering with, it is not wise to trust clients too much. As it will be shown in the following sections, implementing some parts of the instant messaging functionality in a pure peer-to-peer model would require distributed algorithms which rely on the trust in clients. Therefore, the peer-to-peer architecture is not suitable for my instant messaging system.

Moreover, some features, such as chatrooms, are not easily implemented in a peer-to-peer model. Using servers for implementing chatrooms allows users to easily come and go, while preserving the chatroom persistence. The concept of a chatroom is therefore implemented in a client-server model.

As mentioned in Chapter 3, to alleviate the risk of denial-of-service attacks against clients, we want to protect clients' IP addresses and reveal them only to the persons authorized by the owner of the IP address. Unfortunately, in the Location Service, lookup operations are not subject to access control. Therefore, I decided to make use of an intermediary between the Location Service and clients. The server is such an intermediary. The client's IP address is not inserted into the Location Service. The server, the client is logged into, inserts its own IP address instead. If anyone wants to contact the client, he must first contact the server.

Single versus multiple server

Another question we came across was whether to use a single server or multiple servers. A centralized architecture is relatively easy to secure but is also inherently inscalable. A single server is always limited with respect to the number of users it can handle. A centralized architecture does not cope well with large geographical distribution of clients. The clients that lie very far apart from the server would perceive the performance of the system as very bad, due to high communication latencies in wide-area networks. Replication of servers solves both problems. If the number of clients exceeds the capacity of existing servers, it is always possible to add a new server. Problems with large distances are also alleviated since each client can find a server in the neighbourhood. That assumes that the servers are well spread over the area where clients can be found. Ideally, clients should be able to find a server on their local network or close by their Internet Provider. Naturally, if the clients that want to communicate are in two different parts of the world, they will suffer from communication latencies and nothing can be done about it. Nevertheless, servers will not prevent clients that lie close to each other to communicate fast. Thus, the replication and careful placement of servers makes it possible to exploit locality.

Clients

A client is a relatively simple piece of software whose most advanced part is probably the graphical user interface. A client sets up a connection with the nearest instant messaging server (further referred to as the client's local server). The IP address of the nearest server is stored in the client's configuration file.

The client sends and receives presence notifications, subscriptions and preferences updates through the connection with its local server. A client also participates in chatrooms via its local server. To establish a chat session, a client set up a point-to-point connection with the other client. However, before setting up such a connection, the other client's IP address is needed. This address is obtained via the client's local server, as I explain in Section 5.10.4.

The Global User Directory server

The Global User Directory (GUD) server maintains a database of the users of the instant messaging system. Each user can decide whether he wants to insert any information about himself into the Global User Directory. For each user who decides to store his data in the Global User Directory, the GUD server stores the user's user handle along with some additional data such as name, age, email address, etc. A user handle consists of the userID and the address of the user's home server: the server where the user has first registered. The user handle is all the information that is needed to contact the user in my instant messaging system.

A user can decide what information he wants to insert into the Global User Directory, for example he can give only his name and not his age and email address.

The Global User Directory stores also information about the chatrooms available in the instant messaging system. For each chatroom, the Global User Directory stores the identifier of this chatroom, called chatroomID, together with a description of the chatroom, that is, what kind of topics are discussed in the chatroom.

Each user of the instant messaging system has the right to read the data contained in the Global User Directory. To read the database, a user's client sets up an SSL connection with the GUD server. While establishing the connection, both sides have to authenticate themselves to each other. Through the SSL connection with the GUD server, a client sends queries to the database. A user is the only one that can update his information in the Global User Directory. To update the data about himself, a user sets up an SSL connection with the GUD server. GUD servers authenticates the user and gives him write access only to his data. Through the SSL connection with the GUD server a user sends update requests. The information concerning chatrooms is inserted into the database by the administrators of the instant messaging system.

Implementing the Global User Directory with a centralized, dedicated server hinders the scalability of the system. The number of users whose data can be stored in the Global User Directory is limited by the capacity of the server.

I have chosen the central server for the sake of simplicity. Implementing the Global User Directory in a scalable way would cost much effort while the Global User Directory is only an additional service of an instant messaging system.

The use of the Location Service

The instant messaging servers use the Location Service to locate users, other instant messaging servers and chatrooms. Every user is given a unique object handle, the userID. Servers have a common object handle, namely the serverID. Each chatroom has an object handle, the chatroomID. The Location Service is therefore used by the system in the following ways: In order to provide the above services, the Location Service stores the following information:

Communication

The greatest weakness of existing instant messaging systems lies in sending information along insufficiently protected or completely insecure channels. On the other hand, creating a secure communication channel is often a difficult task. As an example, consider ICQ. This instant messaging system uses its own cryptographic algorithm which was very quickly hacked. The code for decrypting ICQ messages is accessible on the Internet. This example shows that relying on proprietary, unproven solutions is not a good idea. Therefore, we wanted to use existing security mechanisms that have been already tested in a hostile environment with many security threats. We found a solution to our problem in Netscape's Secure Sockets Layer (SSL) system.

SSL

SSL is a suite of cryptographic protocols that provides strong authentication and encryption. The main advantage is that it has been widely used for many years to protect sensitive information such as that being sent during financial operations. It has not been proven beyond all doubt that SSL is secure against any possible attack. However, successful attacks on SSL are not known. Its long functioning in a hostile environment where the stakes on breaking into a system are extremely high makes it one of the most reliable security tools that currently exists. Therefore, I decided to abandon any effort to create my own security solutions and to rely completely on SSL.

Certificates

SSL uses X509 certificates for authentication. A certificate provides a binding between the subject's name (the name of the entity for which the certificate was issued) and its public key. If the subject has the corresponding private key, it can prove that it is the legal owner of the name contained in the certificate. A certificate is signed with the issuer's private key. The public key of the issuer is assumed to be publicly known, so everybody is able to check the certificate's authenticity. However, if the issuer is not publicly known, he can add his own certificate to the certificate he creates. Such a set of certificates is called a certificate chain. It contains the certificate of the subject followed by the certificates of the subsequent issuers. The last certificate in the chain is a self-signed certificate of a well-known authority. In my system, there is such a Certification Authority (CA) whose public key is known to every component, that is, every client and every server, of the system.

Every component of the system stores the CA's self-signed certificate. Certificates of the servers are issued by this CA. The subject name field of the server's certificate contains the text ``Server.'' A user has a certificate issued by one of the servers. The subject name field contains the user handle. A user, to authenticate himself to a server or another user, submits a certificate chain consisting of his certificate, the certificate of the server and the self-signed certificate of the CA. A server's certificate chain consists of the server's certificate and the certificate of the CA. The Global User Directory server has a certificate issued by the CA. It uses the certificate to authenticate itself to the clients.

Group communication

The decision to use SSL has some serious implications for the design of the instant messaging system. SSL is designed for protecting point-to-point connections. It does not support any form of group communication. However, various services require group communication in my instant messaging system: Therefore, I considered alternative solutions like secure multicast. Although there is a lot of research being done on this subject, there are hardly any widely accepted and ready-to-use security mechanisms for group communication. Therefore, I decided to make use of SSL and to implement group communication using point-to-point connections. The topology of these connections is described in the following sections.

User profiles

A user profile is the data concerning a user stored in the instant messaging system. It includes: Instant messaging privacy preferences define who can initiate a chat session with a user. Instant messaging privacy preferences consist of three kinds of settings: the allow list, the disallow list and the default preference. An allow list contains user handles of the users that are allowed to initiate a chat session with the given user. A disallow list contains user handles of the users that are not allowed to initiate a chat session with the given user. Finally, a default preference defines whether the users that are neither at the allow list nor at the disallow list can initiate a chat session with the given user. Instant messaging preferences are used both by the user's client and by the user's local server. The user's local server checks the instant messaging preferences of the user before it passes the user's IP address to some other user. Only users that are allowed to initiate a chat session with the given user are allowed to obtain his IP address (more details on obtaining users' IP addresses can be found in Section 5.10.4). The user's client checks the instant messaging preferences when an invitation message arrives. If the invitation is sent by a user that is not allowed to initiate a chat session with the given user, the invitation is automatically rejected. An invitation from a user that is allowed to initiate a chat session with the given user is not automatically accepted. The user who receives an invitation is asked by his client whether he wants to accept or reject the invitation. Presence privacy preferences define who can or cannot subscribe to a user's presence information. Presence privacy preferences also consist of an allow list, a disallow list and a default preference. Their meaning is similar as the instant messaging privacy preferences. When someone wants to subscribe to the user's presence information, the subscriber's local server first checks the user's presence privacy preferences.

Managing user profiles

The users of the system may be mobile. Therefore, a user should not be assigned to any specific server. A user must always have the possibility of logging into the current nearest instant messaging server. This requirement introduces a problem concerning where to store user profiles.

The server the user is logged into needs an access to the user profile to check the user instant messaging preferences when an IP address request arrives. The client needs an access the profile to check user instant messaging preferences when an invitation arrives and to let the user view and update his profile. Other servers also need an access to the user profile to process subscription requests.

In the following paragraphs, I discuss several possible solutions to the problem of storing user's profiles.

On clients

Figure 5.2: Storing client profiles - on clients
\includegraphics[width=\textwidth]{profiles1.eps}

A user can store user profile in his client-side software. When a user logs into a server, the user profile is copied from the user's client to the user's local server. The client and the local server have an access to their local copies of the user profile. When other servers need an access to the user profile to process subscription requests, they contact the user's local server. If the user is online, the address of his local server can be found by performing a lookup in the Location Service. When the user updates his profile, his client forwards the updates to his local server. When the user logs out, his user profile on the server is cleared. This approach is illustrated in Figure 5.2. With this approach, a user is free to log into any server he wants. Logging into a server other than the usual one does not affect the performance of the system from this user's point of view, as it is the case with some other solutions (as we discuss below). The major drawback of this solution is that the user profile of a given user is accessible only while the user is online. The user profile is, however, sometimes necessary when the user is not logged into the instant messaging system. Subscribing is such a case: when somebody wants to subscribe to a user's presence information, the user's presence privacy preferences must be checked. If the user is offline, checking cannot be done and the subscription cannot be processed until the user is logged in again. I did not want my instant messaging system to have such limitations. Another problem is introduced by not being able to access a user's profile while he is offline. Not being able to access the user profile and proceed with subscriptions may cause a violation of the user's privacy by disclosing his status to unauthorized persons. To check somebody's status, one has just to try to subscribe to this person's presence information. If the subscription is processed, the victim is online; otherwise he is offline.

Centralized database

Figure 5.3: Storing client profiles - centralized database
\includegraphics[width=\textwidth]{profiles2.eps}

Using a central database for storing user profiles is another solution. The database can be a part of one of the instant messaging servers or stored on a separate, dedicated server. All the user profiles are kept in one place. When a user logs into one of the servers, the server contacts the database and retrieves the user profile. The profile is also copied to the client. When the user updates his profile, the updates are propagated by his client to his local server. When the user logs out, his local server forwards all updates to the central database and removes its local copy of the client profile. Thus, while the user is online, the up-to-date version of his profile is stored only on his client and on his local server. When other servers need access to the user profile to process subscription requests, they contact the user's local server if the user is online or the central database if the user is offline. If the user is online the address of his local server can be found by performing a lookup in the Location Service. The centralized database approach is shown in the Figure 5.3. The centralized database solution does not have the drawbacks of the on-clients solution: the user profiles are always accessible regardless of whether the user is online or offline. Another advantage of this approach is its simplicity. The user data are not replicated. There is therefore no redundancy of data.

The main drawback of this solution is its lack of scalability. The total number of users of the instant messaging system is limited by the capacity of the database. With the growth of the instant messaging system, the database may become a bottleneck. This approach also lacks geographical scalability. Some servers are likely to lie far away from the database and to suffer from high wide-area network communication latencies. Finally this solution introduces a single point of failure by keeping all the user profiles in one single place.

Distributed database (home server approach)

Figure 5.4: Storing client profiles - distributed database
\includegraphics[width=\textwidth]{profiles3.eps}

An improvement over the centralized database approach is to use a distributed database. In the distributed database approach, each server stores the profiles of part of the users. Preferably, the profile of a user is stored on the server he logs into most often. Another option is that the user's profile is stored on the server that created that user's account. The server which stores the user's profile is called the user's home server.

When a user logs into some other server his profile is temporarily copied to the other server, for example client C2 in Figure 5.4. This server becomes the user's local server. The profile is also copied to the client. All updates made by the user are forwarded by his client to his local server and transferred to the home server when the user logs out. Once the transfer is completed the profile on the local server is cleared. Thus, while the user is online, the up-to-date version of his profile is stored only on his client and on his local server. When other servers need an access to the user profile, they contact the user's local server if the user is online or the user's home server if the user is offline. If the user is online, the address of his local server can be found by performing a lookup in the Location Service. The address of the user's home server is a part of his user handle. The distributed database approach is illustrated in Figure 5.4. The advantages of this solution is its scalability with respect to the number of users. The capacity of one server is limited but we can always add new servers when the existing ones become overloaded. Another advantage is that there is no unnecessary redundancy of user profiles. The user profile is only to be found at the user's client, the user's home server and at his local server if those two servers differ. Nevertheless, the distributed database works best when most of the users generally log into the same server. Highly mobile users constitute a problem since their user profile must often be sent to new servers and may need to travel through long communication paths. Another disadvantage is that, if one of the servers is down, the users for whom it is the home server cannot use the instant messaging system, since their user profiles cannot be accessed.

Replicated database

Figure 5.5: Storing client profiles - replicated database
\includegraphics[width=\textwidth]{profiles4.eps}

The replication of the user profile is a way to reduce the problem encountered with mobile users. With a replicated database, a user's profile is replicated on every server: each server stores the profiles of all the users. When a user logs into one of the servers the profile is copied to his client. Updates are transferred by the user's client to the user's local server and propagated to all other servers in the system when the user logs out. Thus, while the user is online, the up-to-date version of his profile is stored only on his client and on his local server. When other servers need an access to the user profile, they contact the user's local server if the user is online or use their own copies if the user is offline. If the user is online, the address of his local server can be found by performing a lookup in the Location Service. This approach is shown in Figure 5.5.

A user can log into any server and always obtain an equal performance of the instant messaging system from his point of view. The replicated database approach therefore solves the problem of mobile users. Another advantage is that even if some servers are down, all users can still access their profiles and use the instant messaging system. However, the main drawback of this solution is its lack of scalability. Since each server must store the profiles of all users, the total number of users is limited by the capacity of a single server. Another disadvantage is the redundant storage of profiles. When the number of servers grows, the number of copies of the profile of each user grows as well. Some of these copies are completely unnecessary. Most of the users will connect to a limited number of servers and it is likely that no user will connect at least once at each server. As a consequence, most of the servers will never use part of the profiles they store.

Hybrid solution

Figure 5.6: Storing client profiles - hybrid solution
\includegraphics[width=\textwidth]{profiles5.eps}

An alternative solution is to combine database distribution and user profile replication. If a user is stable, that is, generally logs into the same server, his profile is stored only on this server. When the user is highly mobile, his profile is replicated across the servers he usually logs into. When a user changes his behavior, for example logs into a new server, his profile may be replicated. A new replica is added at each new server or an existing replica is removed at a server the user does not log into anymore.

When a user logs into a one of the servers, the server first checks whether it already stores a copy of the user's profile. If not, it must find a server that has a copy of the user's profile and transfer the profile from the other server. The user profile can be found by, for example, broadcasting a profile request to all servers. After the user profile has been obtained, it is also copied to the user's client. When the user updates his profile, his client forwards the updates to his local server. When the user logs out, the updates must be propagated to all other servers holding this user's profile. It can be done by broadcasting the updates to all servers. Thus, while the user is online, the up-to-date version of his profile is stored only on his client and on his local server. When other servers need an access to the user profile, they contact the user's local server if the user is online or find another replica of the user profile (by broadcasting a profile request) if the user is offline. If the user is online, the address of his local server can be found by performing a lookup in the Location Service. The user profile on the user's local server is not cleared immediately when the user logs out. The server keeps the profile and removes it only when the user does not log into this server for a long time.

This solution combines the advantages and alleviates the problems of the distributed and replicated database approaches. The main drawback of this solution is its relative complexity.

My design

For my instant messaging system, I have chosen the distributed database (home server) approach. A user's profile is stored on the server that created this user's account and is temporarily copied to other servers as the need arise. This solution is directed towards stable users but still allows mobility. I selected this solution for its simplicity.

Connecting to the server

In this section, I describe the protocols for creating a user account, logging in and logging out at a server.

Creating user account

Figure 5.7: Creating user account
\includegraphics[scale=0.5]{create-account.eps}

When a new user wants to make use of the instant messaging system, he must first register at one of the instant messaging servers. The user's client sets up an initial SSL connection with a server. The client connects to the port on which the server listens for connections without client-side authentication, as the user cannot yet authenticate himself. The client generates a public-private key pair and sends to the server a create-account request containing the public key, shown as step (1) in Figure 5.7. The server contacts the Location Service which then generates a userID for the new user (steps (2) and (3) in Figure 5.7). Next, the server creates a user handle by combining the userID with its own IP address. The server also creates a default user profile for this user and stores it locally in its own database. The last action taken by server is generating a certificate for the new user and creating a certificate chain consisting of the newly generated certificate, the server's certificate and the Certification Authority certificate. The user certificate contains the user handle generated by the server and the public key submitted by user. Note that the server does not know the user's private key. The user handle and the certificate chain is sent back to the client and the connection is closed (steps (4) and (5)).

Logging in

Figure 5.8: Logging in
\includegraphics[scale=0.5]{log-in.eps}

When logging into the instant messaging system, a client sets up an SSL connection with a server and sends a login request (step (1) in Figure 5.8). The client connects to the port on which the servers listens for connections with client-side authentication. Both client and server perform an authenticity check against each other's certificate. The server extracts the user handle from the client certificate. If the authentication succeeds, the server inserts the userID together with its own IP address into the Location Service (step (2) in Figure 5.8). The server also inserts the client into its database of logged users. If the server is not the user's home server, the user profile has to be downloaded from the home server. In that case, the user's local server sets up an SSL connection with the home server and requests the user profile (step (3)). After the profile has been obtained (step (4)), the SSL connection with the home server is broken and the profile is temporarily stored on the local server (until the user logs out). Next, notifications for the user's subscribers are sent (step (5)). The server also checks the statuses of the people on the user's subscription list (step (6)), which is a part of the user profile. Distributing presence notification will be described in more detail in Section 5.12. Finally, the server sends the user profile and the statuses of the people on the user's subscription list to the client (step (7)). The login process is now complete. The SSL connection between the client and the server stays up until the user logs off.

Logging out

When logging out, a client sends a logout message to its local server, shown as step (1) in Figure 5.9. The server removes the userID from the Location Service (step (2) in Figure 5.9) and notifies the user's subscribers (step (3)). It removes the user from its database of logged in users. Then it clears the user's profile and breaks the SSL connection with the client.

Figure 5.9: Logging out
\includegraphics[scale=0.5]{log-out.eps}


Chat sessions

To implement chat sessions, I decided to use point-to-point connections. The advantages of this approach are as follow: point-to-point connections reduce the load on the servers, as the clients connect directly to each other. No message has to be forwarded through the servers. The users do not have to trust servers as the servers do not take part in sending messages. Moreover, servers do not know the clients' private keys. The contents of the chat session are therefore protected even against possible malicious servers.

Connection management in multi-party chat session

The only problem with point-to-point connections lies in the implementation of multi-party chat sessions. I considered several possibilities which I describe below:

Figure 5.10: Possible implementations of multi-party chat sessions
\includegraphics[width=\textwidth]{session1.eps}

I selected the first solution for its simplicity. Experience with existing instant messaging systems shows that chat sessions are usually held between users that know each other and that have each other on their subscription lists. To talk with users they do not know, people usually use chatrooms. Users' subscription lists are rather small, 10-20 people, therefore I can assume that the number of participants of a multi-party chat session will be of the similar size.

Session identifiers

Each chat session has a globally unique identifier, SessionID. All messages sent within a session are marked with the SessionID of this session. Also invite, leave and join messages are marked with the SessionID. The SessionID is generated by the user that started the session.

Chat session coordinators

Session membership, that is, the group of participants of a chat session, can change over time. Since any member of a chat session can invite a new member at any time and since members can leave at any time, the members' view of the session membership can become inconsistent. In other words, two different members of a session may have different ideas of who currently belongs to the session. This can happen in two situations: To prevent inconsistencies in the membership view to occur, we must ensure that there is only one inviting member at a time. I have solved this problem by using a session coordinator that holds an invitation token and grants it to only one member of the session at a time. A new member can be invited only by the current token holder. A natural coordinator is the user that has started the session. When the coordinator leaves, however, he must pass the responsibility to another member and tell the remaining members that the coordinator has changed. The invitation procedure is fully described in Section 5.10.5.


Obtaining user's IP addresses

Before inviting a new member to the session, a client must obtain the IP address of the new member's machine. The client contacts its local server and sends it an IP address request. The server checks its database of connected users to see whether the requested user is logged in this server. If not, the server performs a lookup in the Location Service to find the IP address of the server where the requested user is logged in. If the address is found, it forwards the IP address request to this other server. Before the requester is given the IP address, the preferences of the requested user are checked. The IP address is handed over only if the requested user allows it. The actions performed by the server are described in the pseudo-code shown in Figures 5.13.

Figure 5.13: The algorithm for looking up IP addresses of clients
\begin{figure}
\centering
\begin{verbatim}
if (the requested client is logg...
...ge is returned to the requesting client
}
}
}\end{verbatim}
\end{figure}


Inviting

After obtaining the IP address, a user can be invited to the session. The inviting client opens a direct SSL connection with the new member and sends him an invitation message. The invited person can either join the session or reject the invitation. If he joins, the inviting person tries to obtain an invitation token from the coordinator. After the token was granted, the inviting person sends the new member the list of IP addresses of the current members of the session. The new member establishes SSL connections with all the current members. After all connections were set up, the new member notifies the coordinator that the invitation process is completed. The token can be now granted to another member.

The coordinator must not leave while the token is given to some other member of the session. However, he may leave while the token request is on the way, that is, already sent by the inviting member but not received yet by the coordinator. In that case, the inviting member receives from the coordinator a leave message instead of the token. A leave message contains the information who is the new coordinator (the leave procedure is described in more detail in Section 5.10.8). The inviting member resends then his token request to the new coordinator. The algorithm for the inviting person is described in a pseudo-code in the Figure 5.14. The algorithm for the joining member is shown in the Figure 5.15.

Figure 5.14: The algorithm for the inviting member
\begin{figure}
\centering
\begin{verbatim}
send invitation message
wait fo...
...m the coordinator
}
}
}
send the member list\end{verbatim}
\end{figure}

Figure 5.15: The algorithm for the joining member
\begin{figure}
\centering
\begin{verbatim}
send join message
wait for the ...
...ordinator that the invitation
 process is complete\end{verbatim}
\end{figure}

Tickets

While joining a session, the new member sets up SSL connections with other members of the session. He must prove to them that he was really invited to this session. For that purpose, a session ticket is used. A ticket is created by the inviting person and handed over to the new member. The new member shows it to all current members while setting up connections with them. A ticket consists of the following components: The ticket is signed with the private key of the issuer. All the members of the session know the public key of the issuer and can verify the signature. The ticket is therefore a proof that its owner has been really invited to the session with the given ID.

Chatting

Messages exchanged by the members of a chat session are sent along the SSL connections established between the clients. In case of a multi-party chat session, each message is sent many times, once to each member of the session.


Leaving

A member that wishes to leave a chat session sends a leave message to all the other participants. If the leaving person is the coordinator, he must select a new coordinator (he does it at random) and tell all the other members who the new coordinator is. This information is included in the leave message. Then all the SSL connections with the leaving member are broken.


Chatrooms

Chatrooms are the part of the instant messaging system for which servers are absolutely necessary. In the model of my instant messaging system, a chatroom gives a user the possibility of using a nickname and thus hiding his true identity. With point-to-point SSL connections, hiding a user identity is impossible to implement, as while setting up SSL connections both sides have to authenticate themselves. Therefore, to implement a chatroom with nicknames we need servers.

Therefore, to take part in a chatroom, a client communicates with a server. Chatrooms can be implemented either in a centralized or in a distributed way:

Figure 5.16: Chatrooms - centralized vs distributed approach
\includegraphics[width=\textwidth]{chatrooms1.eps}

The first approach, although simple, lacks scalability. The number of participants of a chatroom is limited by the number of connections the chatroom server can accept. Moreover, this solution does not exploit locality. Even if all the current participants of a chatroom are located close to each other, they cannot communicate fast if the chatroom server happens to lie far away from them.

With the distributed approach, the number of users is not limited, as each server maintains connections only with the users that are logged into this server. The distributed approach preserves locality. If all participants of a chatroom are situated close to each other, their local servers lie also near each other. The communication is therefore fast. Taking into account the above remarks I decided to implement chatrooms in a distributed way.

Communication between servers

Having multiple chatroom servers and being able to use only SSL point-topoint connections, we need to decide on the topology of the network between the servers. I considered two alternatives: a static topology and a dynamic one.

Static topology

Figure 5.17: Chatrooms - each pair of servers is connected
\includegraphics[scale=0.5]{chatroom-st1.eps}

The topology of the SSL connections between chatroom servers can be static, that is, not dependent on how many chatrooms there are or where their participants lie. The simplest solution is to connect each pair of servers. This approach is illustrated in Figure 5.17. However, this solution is not scalable. The number of servers in the system is limited by the number of connections one server can set up. There would be a huge number of connections, namely $ \binom{n}{2}$ for n servers. Many of them would never or hardly be used. I abandoned this idea. Another approach is a static spanning tree of connections covering all the servers. The number of connections for n servers would be n-1. This approach is shown in Figure 5.18. In this solution, there are also connections that would not be used.

Figure 5.18: Chatrooms - a spanning tree of connections between servers
\includegraphics[scale=0.5]{chatroom-st2.eps}

Dynamic topology

Figure 5.19: Chatrooms - dynamic topology of connections between servers
\includegraphics[scale=0.5]{chatroom-dt.eps}

With a dynamic topology, the SSL connections between servers are set up when it is necessary. For each chatroom, its participants' local servers build a spanning tree of connections. Only the servers where some participants of the chatroom are logged into are part of the tree. The dynamic topology is shown in Figure 5.19. To join such a tree, at least one server which is a part of this tree, must be located. For this purpose the Location Service is used. Each chatroom has a unique identifier (chatroom ID). Each server taking part in this chatroom inserts its contact address together with the chatroomID into the Location Service. When a new server wants to join a chatroom, it first performs a lookup in the Location Service and finds the nearest server that also implements that chatroom. Then, the new server sets up an SSL connection with the chatroom server it found. Now the new server is part of the tree.

Removing a chatroom server from the tree is more problematic. A server has to remove itself from the tree when none of its clients uses the chatroom any longer. It can be safely done only with the servers that are leaves of the tree. The other intermediate servers must stay in the tree to forward messages sent within the chatroom from one part of the tree to the other. As a result, we may have more servers in the tree than the number of users of the chatroom. This problem can be alleviated by limiting the number of servers that insert their contact address into the Location Service. A server that inserts its address into the Location Service is further referred to as a core server. The other servers do connect to the tree, but do not insert their addresses into the Location Service. Such a server will be further called a proxy server. Since proxy servers cannot be located, no other server can connect to them and they will remain leaves of the tree. The proxy servers can therefore be easily removed from the tree when all their users have left the chatroom. The core servers are likely to stay in the tree for a long time. A chatroom tree with and without limiting the number of core servers are shown in Figure 5.20. A spanning a tree is built for each chatroom. The SSL connections can be reused: if two servers are already connected, there is no need for setting up a separate SSL connection for a new chatroom. However, if the number of chatrooms is huge, the number of connections may be very large. For huge number of chatrooms, static topology is more appropriate. With the static topology the number of SSL connections is independent of the number of chatrooms. The number of connections depends only on the number of servers. However, since chatrooms are created by the administrators of the instant messaging system (see section 5.11.2) we can limit their number. Therefore, for our system, I have chosen the dynamic topology. It does not imply maintaining any unnecessary connections, which is the case with the static topology and it uses the Location Service in an interesting way as a means for finding chatroom servers.

Figure 5.20: Chatrooms - core servers
\includegraphics[width=\textwidth]{core-servers.eps}


Creating a chatroom

Creating a chatroom boils down to creating a chatroomID and publishing it in the Global User Directory. Chatrooms are created by the administrators of the instant messaging system.

Entering a chatroom

To enter a chatroom, a user's client sends its local server a join request containing the chatroomID and the user's nickname for that chatroom. The chatroomIDs can be found in the Global User Directory. If the server already participates in that chatroom, it sends back only an acknowledgment. Otherwise, the server performs a lookup in the Location Service to find the nearest server implementing the chatroom. If the chatroom is found, the user's local server sets up a connection with the server it has found.

Chatting

To chat, a client sends a message to its local server. The message contains the chatromID. The server forwards the message along the edges of the chatroom multicast tree. Every server in the tree delivers the message to its chatroom participants

Whispering

Whispering in a chatroom means sending a message within that chatroom only to a chosen member of the chatroom. Other members do not receive the message. Since servers do not know which user is where, whisper messages are also broadcasted to all the servers participating in the chatroom. A whisper message, however, is delivered only to the chosen user. Other servers simply discard the message.

Leaving a chatroom

To leave a chatroom, a user sends a leave message to its local server. The message contains the chatroomID of the chatroom which the user wants to leave. The server checks whether this was the last user of the chatroom among its clients. If so, the server checks whether it can remove itself from the tree. Removal is possible only when the server is a leaf of the tree. If the server can leave and if it is not a proxy server, it removes its contact address from the Location Service. Afterwards it breaks up the SSL connections with the other chatroom server to which it is connected, provided that those connections are not used by other chatrooms. As a consequence, the other server can become a leaf of the tree. If it does not have any clients using the chatroom, it can remove itself from the tree as well. The procedure goes on recursively.


Presence notification

The presence service delivers presence notifications from publishers to subscribers. In my instant messaging system notifications are distributed via servers, that is, a publisher sends a status update message to his local server. The server distributes the notification among the subscribers' local servers, which in the end deliver the notification to the subscribers themselves. I also considered sending notifications directly from publishers to subscribers, via point-to-point connections, but I quickly abandoned that idea. In the peer-to-peer approach a publisher has to maintain SSL connections with all his subscribers. The number of open connections is limited by the operating system, which would limit the number of subscribers per publisher. This solution is thus not scalable. The other disadvantage is that the connections would be used inefficiently. There would be a very large number of them (one for each pair publisher-subscriber) but the amount of data sent would be rather small, a couple of bytes sent only a few times a day.

Communication between servers

Since the use of a peer-to-peer model for distributing presence is not suitable, I decided to shift this responsibility to servers. In this section, I discuss the problem of how servers should communicate with each other to implement distributing of presence notifications in a scalable fashion. As noted earlier, all the communication within our instant messaging system is implemented by means of SSL. Here, I discuss several topologies of SSL connections.

Short-term SSL connections

In the short-term SSL connections approach, the publisher's and the subscriber's servers set up a connection only to send a single notification. After it has been sent, the connection is immediately broken. The disadvantage of this solution is that setting up an SSL connection is expensive and the amount of data to be sent small. The advantage, however, is that SSL connections are not maintained and therefore the number of subscribers (which determines the number of SSL connections to be set up) does not matter. This solution is appropriate when changes in a user's status occur rarely. In particular, when there are only two statuses, namely ``online'' and ``offline,'' this solution seems to be quite reasonable. There are at most three messages sent between subscriber and publisher: one when the publisher becomes online, one when the subscriber becomes online and one when the publisher becomes offline. The period between those two messages will be usually a couple of hours. Therefore, it is not useful to maintain an SSL connection between servers only to send those three messages.

Dynamic topology

Another approach is to let servers dynamically build a separate spanning tree for each publisher. To build a tree, servers use the same algorithm as in the case of chatrooms. Initially, there is one core server - the publisher's server. That means that each subscriber's server connects directly to the publisher's server. The resulting tree is a one-level tree with the publisher's server as the root and the subscribers' servers as leaves. When the number of subscribers grows, the number of connections the publisher's server must handle grows as well. It may become too large. When the publisher's server has too many connections, the number of core servers is increased and part of the overloaded server's connections is handed off to the new core server. We subsequently do so for any overloaded core server. The new core server is elected from among the proxy servers connected to the overloaded core server. An important and still unsolved problem here is which proxy server to chose. Another problem is how to hand off SSL connections. One possible solution is to make the new core server insert its contact address into the Location Service and then make all the proxy servers connected to the overloaded server break their connections and again look in the Location Service for the nearest server. If the proxy servers are distributed evenly, for part of them the new core server will be now the nearest core server, as shown in Figure 5.21. How big that part will be is strongly related to which proxy server is elected to be a new core server. In the worst case, (almost) all proxies will connect again to the overloaded server. When a server notices that its clients are no longer interested in receiving presence notifications (of a given user), it tries to remove itself from the spanning tree. Removal is possible, however, only when the server is a leaf of the tree, that is to say, when it is a proxy server or core server with no proxies connected. Otherwise it has to remain in the tree in order to forward the presence notifications.

When the publisher becomes offline, the notification is broadcasted along edges of the tree. Next, the tree is removed. The core servers remove their addresses from the Location Service.

Figure 5.21: Increasing the number of core servers
\includegraphics[width=\textwidth]{inc-core-servers.eps}

The main advantage of the dynamic topology solution is its simplicity as long as the number of subscribers is not too large to be handled by one core server. With many subscribers, the algorithm becomes complex and expensive. When the number of subscribers is moderate, the subscribers' servers set up direct connections with the publisher's server. That allows us to avoid problems with removing servers from the tree. In addition, the presence notifications are sent directly from the publisher's to the subscriber's server, without routing it through intermediate servers. When the number of subscribers grows and the one-level tree turns into a many-levels tree, the problem with removing servers from the tree and the problem of routing a notification through intermediate servers emerge again. On the other hand, the direct connection between a subscriber's and the publisher's server may be seen as a disadvantage since the size and frequency of messages exchanged by those two servers are rather small. It is not really worth maintaining an SSL connection to send only a couple of bytes once or twice in an hour.

Static topology - a spanning tree

Yet another approach is to have all servers in the system build a global, shared, static spanning tree of SSL connections. The presence notifications are distributed along the edges of that tree. The algorithm of building such a tree is similar to the algorithm of building trees for chatrooms. All servers insert their contact addresses into the Location Service. When a new server comes up, it looks up the address of the nearest other server and sets up a connection with it. One of the disadvantages of the static topology solution is that a message may be routed through many servers before it reaches its destination. Additionally, as the static topology cannot adapt itself to changing conditions, some of the SSL connections can be hardly used while other can be quite heavily loaded. Nevertheless, since the amount of data concerning one publisher is small, this load-unbalance is not that serious. The main advantage of the static topology solution is that the number of SSL connections does not depend on the number of subscribers. Therefore, extremely popular users (those with many subscribers) will not cause problems with too many SSL connections per server. Another advantage is sharing SSL connections. One connection will be used to send notifications from many publishers. The amount of data sent through it will be sufficient to justify its existence. Taking into account all the above remarks, I decided to choose the static topology solution to implement in my instant messaging system.

Delivering presence notifications

A publisher becomes online

Figure 5.22: A publisher becomes online
\includegraphics[scale=0.5]{publisher-online.eps}

When a publisher becomes online, his local server broadcasts his presence information along the spanning tree. When the notification has reached all parts of the tree, the servers that are not interested in receiving notifications from this publisher send back a pruning message. A server is not interested when neither any of its clients nor any of its children in the tree is interested. This procedure eliminates branches of the tree that do not lead to any subscribers. The working of the algorithm is illustrated in the Figure 5.22. Not interested servers store the information from which neighbour in the tree the notification came from. This information is useful when a new subscriber logs in and the server becomes interested. For each publisher, servers store the following routing information:

Note, that each server in the tree knows all online users in the system.

A publisher's status changes

Figure 5.23: A publisher's status has changed
\includegraphics[scale=0.5]{publisher-status-changed.eps}

When a publisher's status changes, a notification is routed along the tree to all interested servers. This situation is illustrated in the Figure 5.23.

A publisher becomes offline

Figure 5.24: A publisher becomes offline
\includegraphics[scale=0.5]{publisher-offline.eps}

When a publisher becomes offline, a notification is broadcasted along the tree to all servers. The routing information stored on servers concerning this publisher is cleared. This situation is shown in the Figure 5.24.

A subscriber becomes online

Figure 5.25: A subscriber becomes online
\includegraphics[scale=0.5]{subscriber-online.eps}

When a subscriber becomes online, for all the publishers the subscriber subscribes to, the subscriber's local server checks whether the publisher is online. If it is necessary, the server sends an unpruning message and starts receiving presence notifications from publishers, for which it had previously indicated no interest. This situation is shown in the Figure 5.25.

A subscriber becomes offline

Figure 5.26: A subscriber becomes offline
\includegraphics[scale=0.5]{subscriber-offline.eps}

When a subscriber becomes offline, for each publisher the subscriber subscribes to, the servers checks whether this is the last subscriber to the given publisher at his server. If so, the server sends a pruning message and stops receiving presence notifications from that publisher. This situation is shown in Figure 5.26.

Subscribing to presence information

When a user wants to subscribe to the presence information of some publisher, he first contacts his local server. The local server contacts the publisher's local server or the publisher's home server (depending on whether the publisher is online or offline) which stores the publisher's preferences. The user's local server can be found by performing a lookup in the Location Service. The user's home server's IP address is a part of the user handle of that user and is used if no address is found in the Location Service (i.e. the user is offline). After contacting one of those servers, the subscriber's server checks the publisher's preferences to see whether the subscription can be made. If so, the user is added to the publisher's list of subscribers and the subscription is added to the list of user's subscriptions. If the publisher is online, the server sends an unpruning message if necessary and starts receiving notifications from this publisher.

Unsubscribing to presence information

To unsubscribe, a user contacts his local server. The server contacts the publisher's local server or the publisher's home server. The contacted server removes the subscription from the publisher's list. The subscriber's local server removes the subscription from the subscriber's list. If necessary, the subscriber's local server sends a pruning message and stops receiving notifications.

Canceling a subscription to presence information

A subscription to some publisher's presence information can be canceled by the publisher himself. To cancel a subscription, a publisher contacts his local server. The server contacts the subscriber's local server or the subscriber's home server. The contacted server removes the subscription from the subscriber's list. The publisher's server removes the subscription from the publisher's list. The subscriber is notified that his subscription was canceled. The subscriber's server sends a pruning message if necessary.

Changing user preferences

If a user wants to change his preferences concerning who can subscribe to his presence, he contacts his local server. The server changes the preference list. Changing preferences may result in canceling some subscriptions. The server cancels the subscriptions automatically.

Security

One of the main goals of this project was to create a secure instant messaging system. In Chapter 3, I defined a set of requirements for a secure instant messenger. In the following section, I discuss whether and how those requirements are met in my instant messaging system. The security in my instant messaging system relies mostly on SSL. SSL provides three basic security services: authentication, integrity and confidentiality. Authentication in SSL in based on X.509 certificates. Each server in the system has such a certificate and a private key that corresponds to the public key contained in the certificate. Using the certificate and the private key, a server can prove to its clients and other servers that it is a genuine server. Each user of the system also has a certificate and a private key. By means of the certificate and a key, a user can prove to a server and other users that he is a legitimate owner of his user handle. SSL guarantees confidentiality and integrity of messages by means of digital signatures and encryption. SSL uses strong and well tested cryptographic algorithms. This gives the users guarantees that the messages they receive arrive unmodified and that they cannot be read by unauthorized persons. An important issue is that security in my instant messaging system relies to a great extent on servers. A lot of sensitive information flows through servers, such as presence information and chatroom messages. Servers are also responsible for granting access to the user profiles they store. Therefore, to gain trust into the system, the users must trust servers. Moreover, a user must not only trust his local or his home server, but all the servers. This is an implication of the algorithms used to distribute presence notifications and to implement chatrooms. A malicious server could, for example, disclose the presence information that it routes to unauthorized persons. Only in the case of chat sessions, users do not have to trust servers completely. In this case, servers only distribute users' IP addresses. The communication between users is implemented by means of point-to-point connections and therefore a server has no access to the information exchanged during a chat session.

Such a big responsibility put on servers requires that only trusted institutions should be allowed to run an instant messaging server. The server-side software can be distributed freely but the certificates for servers must be granted with a great care.

Hereafter, I describe how the security requirements defined in Chapter 3 are met in my instant messaging system.

Presence service

Subscriptions

Strong authentication mechanisms give a user guarantees that only the persons he authorizes can subscribe to his presence information. The system verifies the identity of a person trying to subscribe and checks it against the user's preferences. The system delivers presence notifications only to the legitimate subscribers. Confidentiality of messages guarantees that only the subscribers are able to read the notifications. No unauthorized person can access a user's preferences concerning presence subscriptions. Authentication also guarantees that nobody can subscribe, unsubscribe or cancel a subscription on behalf of another user. A user can be prevented from performing one of those actions only by breaking the connection between his client and server or by forcing his client or server to crash. SSL guarantees integrity of messages, so the user cannot be prevented from subscribing, unsubscribing or canceling a subscription by modifying messages sent between his client and server. Confidentiality of messages sent between the client and the server ensures that no third party can learn who a user is subscribing to and who is subscribing to him.

Notifications

Confidentiality, integrity and mutual authentication of communications within the system guarantee that a notification is read only by its intended recipients and that nobody can modify or fabricate a notification. Spamming with unwanted notifications is impossible, as notifications are accepted only from a genuine server. The server must authenticate itself to the client, so nobody can pretend being a server.

Instant messaging

Messages

SSL ensures confidentiality, integrity and mutual authentication of communication between clients, therefore messages cannot be intercepted, modified or fabricated.

Chat sessions

To ensure that a user joins a session only if he was invited to this session, SSL authentication is not sufficient. A user could set up connections with all current members of a session and to each of them pretend that he was invited by some other member. Therefore, I introduced session tickets. A session ticket for a new member is issued by the inviting member of the session. The new member presents it to the rest of the participants of the session to prove that he was legitimately invited. Integrity and authentication in SSL guarantee that invite, join, reject and leave messages cannot be modified or fabricated. Nobody can invite to, join or leave a session on behalf of another user. In the peer-to-peer model, it is impossible to ensure that nobody can learn about a user's inviting to, joining or leaving the session. The messages themselves cannot be intercepted but it is possible to trace with whom a user sets up SSL connections and deduct from that what kind of actions he takes. Messages from non-members of a chat session sent within this session (marked with the ID of this session) are not accepted by the members of the session. Since messages are all authenticated, it is impossible to forge a message with the identity of one of the members of the session and send a message within that session on behalf of that member. To send an invitation to a user, it is necessary to obtain his IP address from his server. The server gives the user's address only to the persons authorized by the user. However, if somebody manages to obtain the user's IP address in some other way, he can try to set up a connection with this user and send him an invitation. Such an invitation is immediately discarded by the user's client and the connection broken.

Chatrooms

When a user enters a chatroom, he sends the server a message containing his nickname. This message is protected against eavesdropping by encryption, so only the server knows the binding between the user's nickname and his user handle. Assuming that the server is trusted, nobody can connect a nickname with the true identity of the user.

Possible attacks

Attacks on the Location Service

The Location Service is not secured against illegal insertion, removal or lookups. With an illegal lookup, an unauthorized person can discover whether a user is online or offline. If the lookup returns any contact address, the user is online, otherwise the user is offline. It is , however, impossible to obtain a user's IP address by an illegal lookup. The Location Service stores the address of the client's local server instead of his real IP address. Thus, it is possible to trace a user only to his local server.

Illegal insertions and removals in the Location Service result in some inconvenience for users. If a user's address is removed from the Location Service, he cannot be located by other servers than the server he is logged into. Hence, he cannot be contacted by users logged into other servers. Illegal insertion can also prevent a client from being located. The illegally inserted address can be the address of either a server at which the client is not logged into or of a non-existent server. In both cases, a server that obtains the illegally inserted address as a result of a lookup is unable to locate the client.

Denial-of-service attacks

Any networked computer is vulnerable to denial-of-service attacks. In my instant messaging system, denial-of-service attacks threaten both servers and clients. This risk is slightly alleviated by the fact that only a genuine server or client can authenticate itself and set up a connection with other components of the system. Any other connection is refused. A client, however, may be malicious despite the fact that it has a valid certificate. Therefore, once a client has set up a connection with a server or other client, it can send only messages expected by the server or other client. If it sends wrong or malformed messages, the connection is immediately broken. Nevertheless, even when a client sends only correct messages, it may still constitute a threat. A client can, for example, log in a server and flood it with subscription requests. It can also abuse other user by setting a large number of connections and sending an invite message along each of them. One way of protecting a user against denial-of-service attacks is hiding his IP address. A user's IP address is only known to the server the user is logged in. The server does not insert the client's IP address into the Location Service. It inserts it own IP address instead. If someone wants to obtain the user's address, he must first contact his local server. Before returning the requested IP address, the server checks the user preferences. The IP address is given only to the persons authorized by the user.

Scalability

In this section I discuss, how the instant messaging system I designed meets the scalability requirements presented in Chapter 3. I wanted to create a system that is scalable in two ways: with respect to the number of users, that is, able to handle very large number of users; and geographically, capable of coping with large geographical distribution of its components. Scalability with respect to the number of users is achieved by using multiple servers. The number of users that can be handled by one server is limited, but the number of servers is not. Communication between servers and managing user profiles are designed in such a way that they do not hinder the scalability of the system. Routing presence notifications may constitute a problem, however. Servers need to store routing information for distributing presence notifications. The amount of data stored by one server is dependent on the total number of users in the instant messaging system. On the other hand, the amount of data per user is not much - a couple of bytes - so even with millions of users, servers will need to store only a couple of megabytes of data. Using multiple servers allows us to achieve geographical scalability as well. A geographically dispersed group of users can be coped with by distributing the servers over the same area. Each of the users can then find a server in their vicinity. The communication between clients that lie close to each other is kept local. In the case of a chat session this is obvious, because they use point-to-point connections. The algorithm of distributing messages in chatrooms is designed so that only local servers are involved. The communication is thus also kept local.


Implementation

In order to validate my ideas I implemented an instant messaging system complying to the set of requirements stated in Chapter 2. I concentrated on the instant messaging service. I consider a chat session the most important feature of the instant messaging service, since all existing instant messengers implement this function. Chat sessions are also the most interesting feature in my model, as they require communication between all components of the system: server-server and client-server communication for passing clients' IP addresses and client-client communication for sending messages. Those communications require implementing secure connections between all components of the system. Moreover, chat sessions take advantage of the Location Service. My instant messaging system was intended as an application for the Location Service. Implementing only two-party chat sessions, like most of the existing instant messengers do, is not enough. Multi-party chat sessions introduce the interesting security problem of accepting a new member of a session, who must prove that he was legitimately invited. For that purpose, session tickets were introduced. We also face a mutual exclusion problem when inviting new members. To solve this problem, I implemented a distributed mutual exclusion algorithm using a coordinator and a token. Multi-party chat sessions are thus the most challenging feature in my instant messaging system model. To implement all the security requirements, I have to give the users the possibility of defining their privacy preferences. Therefore, I needed to implement managing user profiles. I used the home server approach. I decided not to implement chatrooms. Chatrooms are an extension of instant messaging systems, far less important than chat sessions. Many existing instant messengers do not implement chatrooms. I also decided to leave out the Global User Directory. This additional service of an instant messaging system is useful but not absolutely necessary. The client-side software has a very simple graphical interface. I decided to omit some client-side features, such as multipart messages or launching external applications to view URLs contained in messages. Those features would only complicate the user interface while working on the user interface of an instant messaging client was not my primary goal. Instant messaging does not make sense without presence service. Nevertheless, for the sake of simplicity, I decided to implement only a simple version of this service with a centralized architecture. The functionalities of the presence service is also slightly reduced. My presence service distinguishes only two statuses: ``online'' and ``offline.'' A larger variety of statuses can be easily implemented in the presence service but it would require enhancing the client-side software.

A simple presence service

The presence service is implemented in a centralized way. A single dedicated presence server maintains information on all the subscriptions. The presence server is in charge of storing the subscription lists and presence preferences of all the users in the system. The presence server maintains a list of all currently online users. For each user, it remembers at which server the user is logged in. All other servers in my system, for the sake of clarity further referred to as instant messaging servers, maintain SSL connections with the presence server. The organization of the components of the system is shown in Figure 6.1.

Figure 6.1: The organization of the components of the system
\includegraphics{simple-presence-design.eps}

The clients do not know about the existence of the presence server. They communicate only with their local instant messaging servers. From their point of view, the presence service works as if it was implemented by a static spanning tree of connections between instant messaging servers.

Communication

The communication between instant messaging servers and the presence server is implemented by means of SSL. The clients do not communicate directly with the presence server, but only through instant messaging servers. The presence server has a certificate issued by the system Certification Authority. It uses this certificate to authenticate itself to the instant messaging servers. Instant messaging servers authenticate themselves to the presence server using their usual certificates.

Distributing presence notifications

A user becomes online

When a user becomes online, he logs in at his local instant messaging server. The instant messaging server notifies the presence server which adds the user to its database. Then, the presence server sends notifications to the local instant messaging servers of all subscribers of that user. Each local server forwards the notifications to his clients. The presence server also checks the statuses of all the people on the user's subscription list and sends that information back to the user via his local server.

A user becomes offline

When a user becomes offline, he logs out at his local instant messaging server. The instant messaging notifies the presence server. The presence server removes the user from its database and notifies the subscribers via their local servers.

Subscriptions

When a user wants to subscribe to a certain publisher, he sends a request to his local instant messaging server. The request is then forwarded to the presence server. The presence server checks the publisher's preferences. If the subscription is allowed, the presence server adds it to the database of subscriptions and sends back an acknowledgment. If the subscription is not allowed, the presence server sends back an error message. The acknowledgment or error message is then forwarded to the user. When a user wants to unsubscribe, he sends an unsubscribe request to his local instant messaging server. The local server forwards the request to the presence server. The presence server removes the subscription from the database of subscriptions.

Changing preferences

To change his preferences, a user contacts his local instant messaging server and sends it a profile-update request. The instant messaging server forwards the request to the presence server which updates the user's data. Changing presence preferences results in canceling the subscriptions that are not allowed anymore by the updated preferences. The presence server sends cancellation notifications to those subscribers via their local servers.

Figure 6.2: The organization of the components of the client
\includegraphics[width=\textwidth]{client.eps}

The Client

The client has a layered architecture. We can distinguish three layers. The communication layer provides basic communication facilities. The functional layer implements the core presence and instant messaging functions, such as ``subscribe'' or ``invite.'' Finally, the Graphical User Interface layer provides the user with an interface to the system. The organization of the client is shown in Figure 6.2. The arrows indicate the flow of control between components. The next section describes the three layers and their components in more detail.

The communication layer

The communication layer is multi-threaded. A listener thread listens for incoming connection requests. For each established connection, it launches a connection thread that waits for incoming messages at that connection. For each type of message, it invokes a function from the functional layer. There are two types of connection threads, one for the connection with a server and one for the connections with other clients.

The functional layer

The functional layer consists of various components implementing the core functionalities of the client.

Chat session component

The chat session component provides functions implementing chat sessions. It maintains a list of all sessions the user is taking part in. For each session, it stores its SessionID, coordinator and member list. There is also a list of pending sessions, that is sessions, to which the user was invited but has not decided yet whether to join or reject the invitation. For each pending session, the client stores its SessionID and the invitor's user handle. A part of the functions implemented by the chat session component are invoked by the communication layer as reactions to incoming messages. Other part is invoked by the GUI layer as reactions to the actions of the user. Table 6.1 summarizes the chat session component functions and indicates by which layer those functions are invoked.

Table 6.1: The functions implemented by the chat session component
  communication layer GUI layer
send a message within a session - +
receive a message within a session + -
invite a new member to a session - +
receive an invitation + -
leave a session - +
join a session - +
reject an invitation - +
add new member to a session (invited by some other member) + -
remove a member from a session (when he leaves) + -
grant the session token + -
receive the notification that the invitation process is finished + -


The chat session component communicates also with the user profile management component. It checks user instant messaging preferences when an invitation to a session arrives. If the inviting person is not allowed to invite the user, the invitation is immediately rejected.

User profile management component

The user profile management component stores the user presence and instance messaging preferences. It contains functions to check and update those preferences. When a user wants to update his preferences, it takes care of sending update requests to the server.

Presence component

The presence component implements the presence service functionality. It stores the subscription list of the user. For each publisher on the list, it stores also his current status. The Table 6.2 summarizes the functions implemented by the presence component and shows which layers invoke those functions.

Table 6.2: The functions implemented by the presence component
  communication layer GUI layer
subscribe - +
unsubscribe - +
receive a presence notification + -
receive a canceled subscription notification + -


User registration and authentication component

The user registration and authentication component implements the following functions: While logging in, the user profile (preferences and subscription list) is retrieved from the server. The user preferences are then delivered to the user profile management component and the subscription list is delivered to the presence component.

Address book component

An address book contains a mapping between user handles and human-friendly names of other users. If a user handle is in the address book, the client uses the corresponding name to display it to the user, for example in the graphical representation of his contact list. Otherwise, it displays a text representation of the user handle. The address book component stores the address book. It implements functions to read and update the address book. It can also read the address book from a file and write it to a file.

The Graphical User Interface layer

The user interface of the client consists of different windows. The most important are the main window (shown in Figure 6.3) and the session window (shown in Figure 6.4).

Figure 6.3: The main window of the client
\includegraphics[scale=0.6]{mainframe.ps}

The main window contains the graphical representation of the user's subscription list and the statuses of the people from that list. The menu gives the user access to the functions of the application. In the lower part of this window, the user can see how many invitations to chat sessions are waiting for his response (pending sessions). By clicking the ``get next'' button, the user can view the subsequent invitations and respond to them. Joining a session launches a separate session window.

Figure 6.4: The session window of the client
\includegraphics[scale=0.6]{sessionframe.ps}

A session window is opened for each chat session. It contains the list of the current members of the session and the list of messages sent within the session. The user can type his messages in the lower part of the window. Other elements of the user interface include: user preferences dialog, subscription list dialog and address book dialog.

An example flow of control

This section shows the flow of control in the system while handling an invite message. The flow is shown in Figure 6.5.

Figure 6.5: Handling an invite message
\includegraphics[width=\textwidth]{client-flow.eps}

When a message arrives it is first processed by the SSL library (step (1) in Figure 6.5). The message is decrypted and its authenticity and integrity is checked by verifying the digital signature of the sender. Next, a connection thread is woken up to handle the message (step (2)). The connection thread invokes an appropriate function from the higher layer. For an invite message, it invokes the receive-invitation function implemented by the chat session component (step (3)). Before accepting an invitation user instant messaging preferences must be checked. The check-preferences function implemented by the user management component is invoked (step (4)). If the user preferences do not allow the invitation to be accepted, the reject function is invoked (step (5)) and a reject message is sent to the inviting party (step (6)). Otherwise, the main window is notified (step (7)). A user hears a beep sound and sees the number of pending sessions increase.

Implementation details

The client was implemented using Java 1.3 to achieve platform independence. The user interface was implemented using Java Swing. I use the 1.0.2. version of the standard Java implementation of SSL called JSSE [4].

Figure 6.6: The organization of the components of the instant messaging server
\includegraphics[width=\textwidth]{server.eps}

The instant messaging server

The organization of the instant messaging server is shown in the Figure 6.6. The implementation of the server can be divided into two layers. The communication layer provides the basic communication facilities. The functional layer contains functions implementing the basic functionalities of the instant messaging server, such as managing user profiles, managing users' IP addresses, distributing presence notifications.

The communication layer

The communication layer is multi-threaded. A listener thread listens for incoming connection requests. For each established connection, the listener thread launches a connection thread that waits for incoming messages at that connection. There are three types of connection threads: for connections with other instant messaging servers, for connections with clients and for the connection with the presence server. For each type of message, a connection thread invokes a function from the higher layer.

The functional layer

The functional layer consists of various components implementing the functionalities of the instant messaging server.

User list

The instant messaging server maintains a list of all the users currently logged into this server. For each user, the server stores his IP address and port number, user handle and user profile. The user profile stored by the instant messaging server contains only instant messaging preferences. Since the presence service is implemented by means of a separate presence server, the parts of user profiles concerning presence (subscription lists and presence preferences) are stored by the presence server.

User registration component

The user registration component is responsible for creating user accounts i.e. registering new users. It communicates with the Location Service to generate user handles. For each new user, a default user profile is created. The user registration component therefore communicates with the user profile management component.

User profile management component

The user profile management component manages user profiles. It contains functions to check and update user profiles. It can load a profile from the database or download it from another instant messaging server, namely the home server of the user. When the profile is updated, the profile management component takes care of updating the database or sending an update request to the server from which the profile was downloaded.

User IP address management component

The user IP address management component handles the requests for IP addresses issued by clients and other instant messaging servers. For a client-originated request, it first searches in the user list. If the user is not found at the user list, the user IP address management component performs a lookup in the Location Service and contacts another instant messaging server to retrieve the IP address. For an incoming server request, the user IP address management component searches only in the user list.

Log in/log out component

The log in/log out component handles logging in and out of users. It adds and removes users to the user list. It inserts and removes user handles at the Location Service. It notifies the presence server when a user logs in or out. Finally, it communicates with the user profile management component to retrieve a profile and send it to the user when he logs in.

The presence component

The presence component handles all the messages concerning the presence service: subscribe/unsubscribe requests, presence notifications and presence preferences updates. It actually only forwards the messages since the presence service is implemented by means of a dedicated presence server. The messages from clients are forwarded to the presence server and the messages from the presence server are forwarded to the clients.

Database connection component

The database connection component implements a simple interface for accessing a database. The instant messaging server uses a MySQL database for storing user profiles. The database connection component provides functions to query and update the database.

The Location Service connection component

The Location Service connection component implements an interface for accessing the Location Service. It implements functions for inserting, removing and looking up contact addresses.

Implementation details

The server was implemented using the same Java 1.3 + JSSE 1.0.2 technology as for the client. In addition, the server uses a MySQL database [8] and MM.MySQL [6] driver to access it. For generating X509 certificates I use the cryptographic library by the Legion of Bouncy Castle [5].

Figure 6.7: The organization of the components of the presence-server
\includegraphics[width=\textwidth]{presence-server.eps}

The presence server

The organization of the components of the presence server is shown in Figure 6.7.

The communication layer

The communication layer is implemented in the same multi-threaded way as in the case of the client and the server. A listener thread listens for incoming connection requests. For each established connection, it launches a connection thread that waits for incoming messages at that connection. For each type of message, a connection thread invokes a function from the higher layer.

The functional layer

The functional layer is divided into four parts: the user list, the presence notification component, the subscription component and the user preferences component.

The user list

The presence server maintains a list of all online users. For each user it stores the following information: user handle, current status, subscription list and at which server the user is logged in.

The presence notification component

The presence notification component takes care of distributing presence notifications among the online users. It also adds the users to the user list and removes them from the user list.

The subscription component

The subscription component handles subscribing and unsubscribing. It updates the database and the subscription lists stored in the user list. It communicates with the user preferences component to check the user preferences before any subscription is made.

The user preferences component

The user preferences component implements functions to check and modify user preferences. It communicates with the database. It handles user update-preferences requests forwarded by instant messaging servers.

Implementation details

The presence server was implemented using the same technologies as the instant messaging server: Java 1.3 + JSSE 1.0.2 and MySQL database + MM.MySQL driver.


Future work

My instant messaging system in its current implementation suffers some limitations. The most important issue to address in the future is to implement the presence service in a scalable way. The current presence service implementation lacks scalability, as it is implemented by a single, dedicated presence server. Moreover, the presence server is implemented in an inefficient way in Java and makes use of an inefficient, non-commercial database to store the user information. The implementation of SSL is also inefficient.

The presence service, as it is implemented now, works well with a small test cases, but I believe that even a small fraction of the load that is put on the most popular instant messaging systems, such as ICQ or MSN Messenger, would be completely devastating for my instant messaging system. To solve this problem two things can be done. Firstly, the presence service can be implemented in a distributed way, as I propose in Chapter  5. Secondly, the presence server can be implemented in a highly efficient way with performance and not just functionalities in mind. It would require using a high-quality, commercial database to quickly handle requests. It would also require switching to a more efficient implementation of SSL. Another issue that should be addressed are denial-of-service attacks. In the current implementation, all components of the system are vulnerable for that kind of attacks. Denial-of-service attacks are especially dangerous when the attack is targeted towards one of the servers. Bringing down an instant messaging server will make using the system impossible for all users, whose profiles are stored on the attacked server. Bringing down the presence server would block the whole system.

One of the causes of the vulnerability to denial-of-service attacks lies in the inefficient implementation of servers. The profile database is probably the most vulnerable part of a server. A client can easily bring an instant messaging server down by flooding it with update-profile requests. All those update requests must be processed by the database and would quickly lead to its overloading. Another cause of the vulnerability to denial-of-service attacks is the fact that in the current implementation, the instant messaging server does not even try to detect when such an attack might be taking place. A server could, for example, measure the frequency at which a client sends his requests and if this frequency exceeds a certain limit, drop the connection with the malicious client. The Location Service is used by our instant messaging system for locating users, servers and chatrooms. Unfortunately, the Location Service is not secured against illegal insertions, deletions and lookups of addresses. The users' IP addresses are protected by the fact that only the servers addresses are inserted into the Location Service. Nevertheless, an unauthorized person can discover whether a user is online or offline by simply performing an illegal lookup in the Location Service. Moreover, an illegal insertion or deletion can prevent a user from being contacted by other users. In the future, the Location Service should therefore be secured against illegal execution of operations. This can be achieved through authentication procedures. The communication between instant messaging servers and the Location Service servers must also be secured by means of SSL. The future implementation of the instant messaging system should include the functionalities that were omitted in the current implementation, namely chatrooms and the Global User Directory. Chapter 5 outlines how those features could be implemented. However, the design of the Global User Directory, as it was outlined in Chapter 5, lacks scalability. Using a single server for storing data concerning all users of the system limits the total number of users in the system. This problem can be solved by using a cluster of servers instead of a single server.

The functionalities of the presence service must also be enhanced. The current implementation allows only two statuses of a user (``online'' and ``offline''). More statuses should be added, such as ``away'' or ``do not disturb''. Moreover, although it has been defined in our instant messaging model, in the current implementation, a publisher cannot explicitly cancel a subscription. A subscription can be only canceled as a result of changing the publisher's preferences. The possibility of canceling subscriptions should be included in the future implementation.

The future implementation can also support more types of messages. Right now, the users of my instant messaging system can only send plain text messages. Possible extensions include support for URLs that can be recognized by the client-side software and marked out in the text displayed to the user. A user could also be given the possibility of launching external applications, such as a browser or a PDF viewer, to view the URLs. A multipart mime-typed message is another extension of a plain text message. Finally, the user interface of the client should be enhanced. Right now, the graphical user interface is very simple and sometimes inconvenient. It is also not very attractive visually. One could easily imagine implementing themes and skins for my instant messaging system, making it more attractive visually.


Conclusion

My project concentrates on creating an instant messaging system that is both secure and scalable. An additional goal is to define a basic set of functionalities that an instant messaging system has to provide, while avoiding overloading it with the bells and whistles from commercial systems. The study of the state of the art in instant messaging gives a reasonable insight of which functionalities are essential. Among this set, I find instant messaging itself, under the form of chat sessions or chatrooms, and notification of the users' presence into the system. In order to avoid pitfalls of existing instant messaging systems, my system complies to a set security and scalability requirements. Namely, for security, all the communications are encrypted and the messages are authenticated. With respect to scalability, emphasis is put on geographical scalability, the instant messaging systems servers and clients are distributed over a wide area; and on scalability in number of users, the number of clients the system can support is large.

Based on this preliminary study, I designed an architecture which can cope with the required functionalities and meet the security and scalability requirements. The security architecture was implemented using SSL, which provides strong encryption and authentication. The scalability of the system was achieved through the distribution of its components. The use of a peer-to-peer model between the clients participating in chat sessions also enhance the scalability of my instant messaging system. By designing my instant messaging system I showed that combining security and scalability in an instant messaging system is indeed feasible.

As a proof of concept, I implemented selected functions of the instant messaging system using Java and SSL. The current implementation includes a simple presence notification service and provides instant messaging service under the form of chat sessions. Chat sessions are the most representative functionality of an instant messenger. In my model, this functionality turned out to be the most challenging to implement. It requires implementing the whole security architecture since it requires communication between all the components of the system.

In the actual implementation, the presence service lacks scalability. For the sake of simplicity, I decided to implement the presence service in a centralized way. However, I believe that the presence service can be made scalable by implementing it with multiple servers connected with a spanning tree of SSL connections.

The current implementation also comes up short on server-side denial-of-service attacks issues. This problem can be alleviated by implementing servers in a more efficient way.

My instant messaging system is not available for public use yet. Before making it public, the implementation must be enhanced. Apart from implementing the presence service in a scalable way, the user interface must be enhanced and some missing functionalities such as chatrooms and Global User Directory must be added. Afterwards, my instant messaging system will be released for public use.

Bibliography

1
Gadu-gadu website, http://www.gadu-gadu.pl.

2
Icq servers, http://www.emergency-world.com/icqhelpnet/servers/.

3
Icq website, http://www.icq.com.

4
JSSE website, http://java.sun.com/products/jsse/.

5
Legion of the bouncy castle website, http://www.bouncycastle.org.

6
Mm.mysql website, http://mmmysql.sourceforge.net.

7
MSN Messenger website, http://messenger.msn.com.

8
Mysql website, http://www.mysql.com.

9
J. Abney and J. Frey, Secure Instant Messaging.

10
S. Dault, Encryption and checkcode of the icq protocol v5, July 1998, http://www.algonet.se/henisak/icq/encrypt-v5.txt.

11
H. Isaksson, Version 5 of the icq protocol, April 2001, http://www.algonet.se/henisak/icq/icqv5.html.

12
M. Day, J. Rosenberg, and H. Sugano, A Model for Presence and Instant Messaging, RFC2778, IETF, January 2000.

13
M. Day, S. Aggarwal, G. Mohr, and J. Vincent, Instant Messaging / Presence Protocol Requirements, RFC2779, IETF, February 2000.

14
R. Movva and W. Lai, MSN Messenger Service 1.0 Protocol, Internet Draft, IETF, 1999, http://www.hypothetic.org/docs/msn/ietf_draft.php.

15
T. Muldowney and E. Landrum, The Jabber Programmers Guide, 2001, http://docs.jabber.org/jpg/.

16
P. Saint-Andre, Jabber Technology Overview, March 2001, http://docs.jabber.org/general/html/overview.html.

17
W. Stallings, Network and internetwork security, Prentice Hall, 1995.

18
A. S. Tanenbaum and M. van Steen, Distributed systems, principles and paradigms, Prentice Hall, 2002.

19
M. van Steen, F. J. Hauck, P. Homburg, and A. S. Tanenbaum, Locating objects in wide-area systems, IEEE Communications Magazine (1998), 104-109, ftp://www.cs.vu.nl/pub/papers/globe/commag.98.ps.

About this document ...

A Secure Instant Messaging System

This document was generated using the LaTeX2HTML translator Version 2K.1beta (1.47)

Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999, Ross Moore, Mathematics Department, Macquarie University, Sydney.

The command line arguments were:
latex2html -split 0 thesis6.tex

The translation was initiated by M. Wrzesinska on 2002-10-22


M. Wrzesinska 2002-10-22