Clustering

Vaishali Tapaswi

Why Clustering

With the internet becoming a primary mode of business, maintainability and high availability are becoming necessities for websites. Outages of services such as Google or Twitter now make the news. Protecting a site from attacks and keeping it available regardless of growth in the number of users or the volume of data transferred is effectively mandatory. To achieve high availability and scalability we generally need some hardware redundancy, which means most sites will run on more than one server. We create a cluster of multiple servers behind a proxy. Only the proxy's IP address is exposed to the client; the proxy then forwards each request to one of the servers in the cluster. The cluster is completely transparent to the client.

Important Terms of Clustering

Load Balancing
Assuming we have 4 servers in the cluster, the proxy can forward a request to any one of them, using an algorithm such as round robin or random selection. With 20 requests per second and round robin as the algorithm, each server receives 5 requests per second. This is the simplest way to achieve load balancing. Here we are treating the proxy as the load balancer (that is, a software load balancer); in practice one can also opt for a hardware load balancer, which can offer more flexibility. Simple load balancing suffices for sites that do not need to maintain any client state. Examples include a tourism site where you search for flights from a source to a destination, or a site that returns catalog information based on specific inputs.
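
The round robin case described above can be sketched in a few lines of Java. This is only an illustration of the selection logic, not how any particular proxy implements it; the class name RoundRobinBalancer and the server names are assumptions made up for the example.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin selector; not taken from any real proxy product.
class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger(0);

    RoundRobinBalancer(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = List.copyOf(servers);
    }

    // Returns the server that should handle the next request.
    // floorMod keeps the index non-negative even if the counter overflows.
    String pick() {
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }
}
```

Calling pick() 20 times on a balancer built with four placeholder names, for example List.of("server1", "server2", "server3", "server4"), selects each server 5 times, which matches the arithmetic above.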

Sticky Session
If you need to retain some data for a particular client (that is, maintain the state of the client), some changes are needed. If you go to a web mail account, the contents of your inbox and drafts depend on your login. Before login, your request can be served by any one of the four servers, but as soon as you log in (which means your session is created on some server and your state is maintained there), all the rest of your requests need to go to the same server. This functionality of sending your requests to the same server on which the session was created is called 'server affinity' or 'sticky sessions'. So once the session is created on one server, we no longer do load balancing for subsequent requests received for that session.
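
A minimal sketch of server affinity, under simplifying assumptions: the proxy identifies a session by a string ID (in a real deployment this usually travels in a cookie), and the first request that carries a given ID binds it to a server. The class StickyRouter and its method names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sticky-session router; the session ID format is an assumption.
class StickyRouter {
    private final List<String> servers;
    private final Map<String, String> sessionToServer = new ConcurrentHashMap<>();
    private final AtomicInteger next = new AtomicInteger(0);

    StickyRouter(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = List.copyOf(servers);
    }

    // A null sessionId means the client has no session yet, so the
    // request is load balanced. Otherwise the session stays on the
    // server it was first bound to.
    String route(String sessionId) {
        if (sessionId == null) {
            return pickRoundRobin();
        }
        return sessionToServer.computeIfAbsent(sessionId, id -> pickRoundRobin());
    }

    private String pickRoundRobin() {
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }
}
```

Requests without a session keep rotating across the servers, while every call to route with the same session ID returns the same server, which is the behavior described above.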

Fail Over
Now we have a session on one server and we continue to do some work with the same session. What if that server fails? One option is to lose all the session data because of the failure; the other is to be redirected to some other server that has information about our session. So in this case what we need is not 'load balancing' but 'failover'. Failover is usually a little trickier than load balancing, because the session state must be available on more than one server. The server or cluster administrator is usually responsible for configuring how session state is made available to other servers.
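
The routing half of failover can be sketched as follows. This example deliberately leaves out the hard part: it assumes the session state has already been made available to the other servers (for example through replication or a shared persistent store), and it assumes something external calls markDown when a server fails. FailoverRouter and its methods are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical failover router; health detection and session
// replication are assumed to happen elsewhere.
class FailoverRouter {
    private final List<String> servers;
    private final Set<String> down = ConcurrentHashMap.newKeySet();
    private final Map<String, String> sessionToServer = new ConcurrentHashMap<>();

    FailoverRouter(List<String> servers) {
        this.servers = List.copyOf(servers);
    }

    void markDown(String server) { down.add(server); }

    void markUp(String server) { down.remove(server); }

    // sessionId must be non-null. The session stays on its bound server
    // while that server is healthy; otherwise it is re-bound to the
    // first healthy server in the list.
    synchronized String route(String sessionId) {
        String bound = sessionToServer.get(sessionId);
        if (bound != null && !down.contains(bound)) {
            return bound;
        }
        for (String candidate : servers) {
            if (!down.contains(candidate)) {
                sessionToServer.put(sessionId, candidate);
                return candidate;
            }
        }
        throw new IllegalStateException("no server available");
    }
}
```

Note that after a failed server comes back, the session stays on the server it failed over to; moving it back would mean another state transfer for no benefit to the client.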

Application Servers and Clustering

Most J2EE application servers support clustering. Since WebSphere and WebLogic are two of the important application servers, we will discuss only the basic clustering support of these two. In recent releases of these servers, creating and managing a cluster has been made extremely simple. Both servers support creating a software proxy with the required plug-in configuration, which lets us test a simple cluster environment on a single machine/node. In WebSphere, session state management is configured through the admin console; in WebLogic it is configured using the vendor-specific deployment descriptor. Both servers allow us to maintain state either in memory or in a persistent store.

In this article, we have discussed the basic concepts of clustering. In the next article, we will see the steps for creating WebLogic and WebSphere clusters.