Virtual Servers (VS) or Virtual IPs (VIPs)

Virtual Servers (VS) are the key to building reliable systems. A VS distributes work among the machines that actually do it (the physical servers). A VS has two parts: a system for distributing the work among the physical servers (the load balancer), and a system for determining whether each physical server is running (the monitor). The two systems interact: if a physical server goes down, it no longer gets any work sent to it, and once it is repaired it starts receiving work again.

A virtual server works as follows. The monitor constantly watches the physical servers and maintains a list of which are up and which are down. When an inbound TCP connection request arrives at the load balancer, the load balancer decides which of the physical servers that are up is next in line and forwards the connection to it. The connection between the client and that physical server is maintained until the session is over.
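The interaction between the monitor and the load balancer can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product is implemented: the class name VirtualServer, its methods, and the tcp_probe helper are all hypothetical, round-robin is assumed as the "next in line" policy, and a plain TCP connect is assumed as the health check.

```python
import socket


def tcp_probe(host, port, timeout=1.0):
    """Hypothetical health check: the server counts as up if a TCP connect succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class VirtualServer:
    """Round-robin balancer that skips servers the monitor has marked down."""

    def __init__(self, servers):
        self.servers = list(servers)   # fixed order used for round-robin
        self.up = set(self.servers)    # the monitor's view of who is alive
        self._next = 0                 # index of the next candidate

    def mark_down(self, server):
        self.up.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.up.add(server)

    def pick(self):
        """Return the next server in line that is currently up."""
        for _ in range(len(self.servers)):
            candidate = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if candidate in self.up:
                return candidate
        raise RuntimeError("no physical servers are up")
```

A monitor loop would call tcp_probe for each server at a fixed interval and then call mark_up or mark_down accordingly; pick is called once per new inbound connection.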

One complexity arises because a session may last longer than a single TCP connection. HTTP is a stateless protocol; in HTTP/1.0, each GET request and its response use a separate TCP connection by default (HTTP/1.1 reuses connections by default, but the protocol itself is still stateless). However, people put state in their web applications all the time (a shopping cart is state, and so is “logging in” to a site). The state in the browser has to match the state in the application, or else the application won't work. The challenge is to keep sending a client to the physical server that holds its state, which is called persistence. On the browser side, the state can be stored as a hidden field in a form, as a cookie, or in the URL. On the application side, the state can be stored in the web server, the application tier, or the database back end. Persistence is where the application logic, the data center design, and the browser come together.
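One common way to get persistence is to key the balancing decision on a session identifier, such as a cookie value, rather than on the TCP connection. The sketch below assumes that approach; the StickyBalancer class and its route method are hypothetical names, and the session identifier is assumed to have been extracted from the request already.

```python
import itertools


class StickyBalancer:
    """Sends every request that carries the same session id to the same server."""

    def __init__(self, servers):
        self._round_robin = itertools.cycle(servers)
        self._by_session = {}  # session id -> physical server

    def route(self, session_id):
        # Requests without a session (e.g. no cookie yet) are simply balanced.
        if session_id is None:
            return next(self._round_robin)
        # First request of a session: choose a server and remember the choice.
        if session_id not in self._by_session:
            self._by_session[session_id] = next(self._round_robin)
        return self._by_session[session_id]
```

The table in this sketch lives only in the balancer's memory, so it would be lost if the balancer restarted; storing the state in a shared back end, as described above, avoids depending on it.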

Another complexity arises because some connections may last a long time, hours or even days (think of the connection between an application and a database server, or between an SSH client and an SSH server). You would like a mechanism for mirroring such connections, so that if the load balancer fails, a standby load balancer can take them over without dropping them. An F5 Local Traffic Manager (LTM) will do connection mirroring, but that's a proprietary solution.

Three ways to load balance using LVS

There are three different modes:

  1. Network Address Translation

  2. IP tunneling

  3. Direct Routing

Network Address Translation (NAT) rewrites the destination address of each request packet (the IP address and port number of the virtual service) to that of the selected physical server, and rewrites the source address of each response packet back to that of the virtual service, so that clients don't know which physical server performed the service. In this way, parallel services on multiple physical servers can be grouped as one virtual service at a single IP address and port number. NAT works at layer 4 of the OSI protocol stack, that is, at the TCP and UDP level. NAT is how an F5 Local Traffic Manager works.
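The two rewrites can be modelled with a small connection table. This is a toy model under stated assumptions, not the LVS kernel code: the Packet and NatBalancer names are hypothetical, packets are reduced to source and destination (IP, port) pairs, and round-robin is assumed for choosing a physical server.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: tuple          # (ip, port)
    dst: tuple          # (ip, port)
    payload: bytes = b""


class NatBalancer:
    def __init__(self, vip, real_servers):
        self.vip = vip                      # (ip, port) of the virtual service
        self.real_servers = list(real_servers)
        self.connections = {}               # client (ip, port) -> real server
        self._next = 0

    def inbound(self, pkt):
        """Request path: rewrite the destination from the VIP to a real server."""
        if pkt.dst != self.vip:
            raise ValueError("packet is not addressed to the virtual service")
        real = self.connections.get(pkt.src)
        if real is None:
            real = self.real_servers[self._next % len(self.real_servers)]
            self._next += 1
            self.connections[pkt.src] = real
        return replace(pkt, dst=real)

    def outbound(self, pkt):
        """Response path: rewrite the source from the real server back to the VIP."""
        if self.connections.get(pkt.dst) != pkt.src:
            raise ValueError("response does not match a known connection")
        return replace(pkt, src=self.vip)
```

Because both directions pass through the balancer in this mode, the client only ever sees the virtual service's address.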

According to http://kb.linuxvirtualserver.org/wiki/LVS/TUN, IP tunneling (IP encapsulation) is a technique for encapsulating an IP datagram within another IP datagram, which allows datagrams destined for one IP address to be wrapped and redirected to another IP address. This technique can be used to build a virtual server in which the load balancer tunnels the request packets to the different servers, and the servers process the requests and return the results to the clients directly, so the service still appears as a virtual service on a single IP address. IP tunneling works at layer 3 of the OSI protocol stack, that is, at the IP level.

Note to self: understand this better. It sounds similar to N-Path, but it requires a tunneling device at the physical server. N-Path requires something special at the physical server as well.
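As an aid to understanding, the encapsulation can be modelled as one datagram carried as the payload of another. This is only a conceptual sketch: the Datagram class and the three helper functions are hypothetical, real headers are reduced to source and destination addresses, and the physical server is assumed to accept traffic addressed to the virtual IP once the outer datagram has been removed.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Datagram:
    src: str
    dst: str
    payload: Union["Datagram", bytes]


def encapsulate(inner, balancer_ip, real_server_ip):
    """Load balancer: wrap the untouched request in an outer datagram."""
    return Datagram(src=balancer_ip, dst=real_server_ip, payload=inner)


def decapsulate(outer):
    """Physical server: strip the outer datagram to recover the original request."""
    if not isinstance(outer.payload, Datagram):
        raise ValueError("not an encapsulated datagram")
    return outer.payload


def reply_directly(request, body):
    """Physical server: answer the client from the virtual IP, bypassing the balancer."""
    return Datagram(src=request.dst, dst=request.src, payload=body)
```

The inner datagram still carries the virtual IP as its destination, which is why the response can go straight back to the client with the virtual IP as its source.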

Direct Routing (DR) sends packets to the physical servers by rewriting the destination MAC address of the data frame with the MAC address of the selected physical server. It has the best scalability of the three methods because the overhead of rewriting MAC addresses is small, but DR requires that the load balancer and the physical servers be on the same physical (layer 2) network.
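The key point of DR, that only the frame header changes while the IP packet inside is left alone, can be shown with a toy frame model. The Frame class and direct_route function are hypothetical, and setting the source MAC to the load balancer's own address is an assumption of this model rather than something stated above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    dst_mac: str
    src_mac: str
    src_ip: str   # client address, never modified
    dst_ip: str   # virtual IP, never modified


def direct_route(frame, balancer_mac, real_server_mac):
    """Re-address the frame at layer 2 only; the IP addresses stay as they were."""
    return replace(frame, dst_mac=real_server_mac, src_mac=balancer_mac)
```

Since the destination IP is still the virtual IP when the frame arrives, the physical server has to be reachable by MAC address alone, which is why the same layer 2 network is required.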

Load balancing using Web Servers