Clustering and NLB Cram notes
Evaluating the Benefits of Clustering
A cluster is two or more computers working together to provide higher availability, reliability, and scalability than can be obtained by using a single system. When failure occurs in a cluster, resources are redirected and the workload is redistributed. Microsoft cluster technologies guard against three specific types of failure:
- Application and service failures, which affect application software and essential services.
- System and hardware failures, which affect hardware components such as CPUs, drives, memory, network adapters, and power supplies.
- Site failures in multisite organizations, which can be caused by natural disasters, power outages, or connectivity outages.
Benefits of Clustering
If one server in a cluster stops working, a process called failover automatically shifts the workload of the failed server to another server in the cluster. Failover ensures continuous availability of applications and data.
This ability to handle failure allows clusters to meet two requirements that are typical in most data center environments:
- High availability. The ability to provide end users with access to a service for a high percentage of time while reducing unscheduled outages.
- High reliability. The ability to reduce the frequency of system failure.
Additionally, Network Load Balancing clusters address the need for high scalability, which is the ability to add resources and computers to improve performance.
Limitations of Clustering
Server clusters are designed to keep applications available, rather than keeping data available.
You cannot use Windows Server 2003 File Replication service (FRS) on shared cluster storage. You also cannot create domain-based Distributed File System (DFS) roots on shared cluster storage.
Your choice of cluster technologies depends primarily on whether you run stateful or stateless applications:
- Server clusters are designed for stateful applications. Stateful applications have long-running in-memory state, or they have large, frequently updated data states. A database such as Microsoft® SQL Server™ 2000 is an example of a stateful application.
- Network Load Balancing is intended for stateless applications. Stateless applications do not have long-running in-memory state. A stateless application treats each client request as an independent operation, and therefore it can load-balance each request independently. Stateless applications often have read-only data or data that changes infrequently. Web front-end servers, virtual private networks (VPNs), and File Transfer Protocol (FTP) servers typically use Network Load Balancing.
Back-end applications and services, such as messaging applications like Microsoft Exchange or database applications like Microsoft SQL Server, are ideal candidates for server clusters.
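The stateful versus stateless rule above can be written down as a small decision helper. This is only a sketch of the rule as these notes state it: the `AppProfile` class, its field names, and `suggest_cluster_technology` are hypothetical names invented for illustration and are not part of any Microsoft tool or API.

```python
from dataclasses import dataclass


@dataclass
class AppProfile:
    """Hypothetical description of an application (not a Microsoft API)."""
    name: str
    long_running_in_memory_state: bool
    large_frequently_updated_data: bool


def suggest_cluster_technology(app: AppProfile) -> str:
    """Apply the rule from the notes: stateful -> server cluster, stateless -> NLB."""
    stateful = app.long_running_in_memory_state or app.large_frequently_updated_data
    return "server cluster" if stateful else "Network Load Balancing"


if __name__ == "__main__":
    for app in (
        AppProfile("SQL Server database", True, True),
        AppProfile("Web front-end server", False, False),
    ):
        print(f"{app.name}: {suggest_cluster_technology(app)}")
```

A real decision also weighs edition support and hardware, so treat the result as a starting point rather than a verdict.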
In Windows Server 2003, Enterprise Edition, and Windows Server 2003, Datacenter Edition, server clusters can contain up to eight nodes. Each node is attached to one or more cluster storage devices, which allow different servers to share the same data.
Network Load Balancing can run on all editions of Windows Server 2003. If your application is stateless or can otherwise be cloned with no decline in performance, consider deploying Network Load Balancing.
Network Load Balancing clusters are groups of identical, typically cloned computers that, through their numbers, enhance the availability of Web servers, Microsoft® Internet Security and Acceleration (ISA) servers (for proxy and firewall servers), and other applications that receive TCP and UDP traffic. Because Network Load Balancing cluster nodes are usually identical clones of each other and can therefore operate independently, all nodes in a Network Load Balancing cluster are active.
Table 6.2 Maximum Number of Nodes in a Cluster
Operating System | Network Load Balancing | Component Load Balancing* | Server Cluster
Microsoft® Windows® 2000 Advanced Server | 32 | 12 | 2
Microsoft® Windows® 2000 Datacenter Server | 32 | 12 | 4
Windows Server 2003, Standard Edition | 32 | 12 | N/A
Windows Server 2003, Enterprise Edition | 32 | 12 | 8
Windows Server 2003, Datacenter Edition | 32 | 12 | 8
Table 6.3 Maximum Number of Processors and RAM
Operating System | Number of Processors | Maximum RAM
Windows 2000 Advanced Server | 8 | 8 GB
Windows 2000 Datacenter Server | 32 | 64 GB
Windows Server 2003, Enterprise Edition | 8 | 32 GB
Windows Server 2003, Datacenter Edition | 32 | 64 GB
Definition Terms
Node: A computer system that is a member of a server cluster. Windows Server 2003 supports up to eight nodes in a server cluster.
Resource: A physical or logical entity that is capable of being managed by a cluster, brought online, taken offline, and moved between nodes. A resource can be owned only by a single node at any point in time.
Resource groups: A collection of one or more resources that are managed and monitored as a single unit. Resource groups can be started and stopped independently of other groups (when a resource group is stopped, all resources within the group are stopped). In a server cluster, resource groups are indivisible units that are hosted on one node at any point in time. During failover, resource groups are transferred from one node to another.
Virtual server: A collection of services that appear to clients as a physical Windows-based server but are not associated with a specific server. A virtual server is typically a resource group that contains all of the resources needed to run a particular application and can be failed over like any other resource group. All virtual servers must include a Network Name resource and an IP Address resource.
Failover: The process of taking resource groups offline on one node and bringing them back online on another node. When a resource group goes offline, all resources belonging to that group go offline. The offline and online transitions occur in a predefined order. Resources that are dependent on other resources are taken offline before and brought online after the resources upon which they depend.
Failback: The process of moving resources, either individually or in a group, back to their original node after a failed node rejoins a cluster and comes back online.
Quorum resource: The quorum-capable resource selected to maintain the configuration data necessary for recovery of the cluster. This data contains details of all of the changes that have been applied to the cluster database. The quorum resource is generally accessible to other cluster resources so that any cluster node has access to the most recent database changes. By default there is only one quorum resource per cluster.
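The ordering rule in the Failover definition (dependent resources go offline before, and come online after, the resources they depend on) is a topological ordering of the dependency graph. The sketch below illustrates that idea with Python's standard `graphlib` module (Python 3.9 or later). The resource names and the dependency map are hypothetical examples, and this is not how the Cluster service itself is implemented.

```python
from graphlib import TopologicalSorter

# Hypothetical resource group: each resource maps to the resources it depends on.
DEPENDENCIES = {
    "Physical Disk": set(),
    "IP Address": set(),
    "Network Name": {"IP Address"},
    "Application": {"Network Name", "Physical Disk"},
}


def online_order(dependencies):
    """Return resources so that every dependency comes online before its dependents."""
    return list(TopologicalSorter(dependencies).static_order())


def offline_order(dependencies):
    """Return the reverse order: dependents go offline before what they depend on."""
    return list(reversed(online_order(dependencies)))


if __name__ == "__main__":
    print("Bring online:", online_order(DEPENDENCIES))
    print("Take offline:", offline_order(DEPENDENCIES))
```

Resources with no dependency between them (here, the disk and the IP address) may appear in either relative order; only the dependency constraints are guaranteed.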
Network Load Balancing improves scalability and availability by distributing client traffic across the servers that you include in the Network Load Balancing cluster. Each cluster host (a server running on a cluster) runs an instance of the applications supported by your cluster. Network Load Balancing transparently distributes client requests among the cluster hosts. Clients access your cluster by using one or more virtual IP addresses. From the perspective of the client, the cluster appears to be a single server that answers the client request.
Some of the common applications and services that run on Network Load Balancing include:
- Web applications running on IIS 6.0
One of the most common solutions that use Network Load Balancing is an IIS 6.0 Web farm. A typical challenge in supporting Web applications occurs when an application must maintain a persistent connection to a specific cluster host. For example, if a Web application uses Hypertext Transfer Protocol Secure (HTTPS), the application should keep contacting the same cluster host, for efficiency. Connecting to a different cluster host requires establishing a new SSL session, which creates excess network traffic and overhead on the client and server. Network Load Balancing maintains affinity and reduces the possibility that a new SSL session needs to be established.
- VPN remote access running on Routing and Remote Access
Another solution that uses Network Load Balancing involves using the Routing and Remote Access service in Windows Server 2003 to provide VPN remote connectivity. In the VPN solution, you combine multiple remote access servers running Windows Server 2003 and Routing and Remote Access to create a VPN remote access server farm.
- Web content caching and firewall running on Microsoft® Internet Security and Acceleration (ISA) Server 2000
You can also use Network Load Balancing in solutions that include ISA Server to provide network security, network isolation, network address translation, or Web content caching. In ISA Server solutions, the design and deployment of Network Load Balancing are integral parts of the ISA Server design and deployment process.
For more information on creating ISA Server designs and deploying ISA Server in your organization, see "Deploying ISA Server" in Deploying Network Services of this kit and see the documentation that accompanies ISA Server.
- Application hosted on Terminal Services
When you run applications on Terminal Services, the Terminal Services clients can be load balanced across a number of computers running Terminal Services. Network Load Balancing is combined with the Session Directory service in Terminal Server to provide improved scalability and availability for Terminal Services.
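The affinity behavior mentioned for the IIS Web farm can be pictured as choosing a host from a hash of some client attribute. The sketch below is an illustration only: the notes do not describe the algorithm that Network Load Balancing actually uses, and `pick_host`, the `affinity` values, and the host names are assumptions made for this example.

```python
import hashlib


def pick_host(hosts, client_ip, client_port, affinity="single"):
    """Map a client connection to one host by hashing a key.

    affinity="single": the key is the client IP only, so every connection
    from that IP maps to the same host (no new SSL session is needed).
    affinity="none": the key also includes the source port, so connections
    from one client may spread across hosts.
    """
    if affinity == "single":
        key = client_ip
    elif affinity == "none":
        key = f"{client_ip}:{client_port}"
    else:
        raise ValueError(f"unknown affinity: {affinity}")
    digest = hashlib.sha256(key.encode("ascii")).digest()
    return hosts[int.from_bytes(digest[:8], "big") % len(hosts)]


if __name__ == "__main__":
    hosts = ["host1", "host2", "host3"]  # hypothetical cluster hosts
    for port in (50000, 50001, 50002):
        print(port, pick_host(hosts, "192.168.7.27", port))
```

With the IP-only key, all three connections print the same host, which is the property an HTTPS client benefits from.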
When a cluster becomes unable to respond to client requests within a specified time, you can improve performance by scaling up your solution.
Take the following actions to scale up your solution:
- Increase system resources (such as processors, memory, disks, and network adapters) for the existing cluster host.
- Replace the existing cluster host with another system that has greater resources.
Another method that you can use to improve client response time is scaling out. You scale out your solution when you add cluster hosts to existing clusters.
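As a rough planning aid for scaling out, the number of cluster hosts can be estimated from the peak request rate and the rate one host can sustain. This is a back-of-the-envelope sketch: the function name, its parameters, and the sample figures are hypothetical, and only the 32-host ceiling comes from Table 6.2.

```python
import math

MAX_NLB_HOSTS = 32  # Network Load Balancing limit listed in Table 6.2


def hosts_needed(peak_requests_per_sec, per_host_capacity, spare_hosts=1):
    """Estimate how many cluster hosts cover the peak load, plus spares for failures."""
    if per_host_capacity <= 0:
        raise ValueError("per_host_capacity must be positive")
    count = math.ceil(peak_requests_per_sec / per_host_capacity) + spare_hosts
    if count > MAX_NLB_HOSTS:
        raise ValueError(
            f"{count} hosts exceeds the {MAX_NLB_HOSTS}-host limit; "
            "scale up the hosts or split the workload"
        )
    return count


if __name__ == "__main__":
    # Hypothetical figures: 900 requests/s at peak, 250 requests/s per host.
    print(hosts_needed(900, 250))  # prints 5 (4 for the load, 1 spare)
```

When the estimate runs past the limit, scaling up the individual hosts raises `per_host_capacity` and brings the count back down.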
CLUSSVC COMMANDS
Cluster service command (shortcut command) and action:
/debugresmon (shortcut: /DR)
Enables the debugging of the resource dynamic-link libraries (DLLs) that are loaded by the resource monitor process.
/fixquorum (shortcut: /FQ)
Allows the Cluster service to start up, despite problems with the quorum device.
/resetquorumlog (shortcut: /RQ)
If the quorum log file is not found or is corrupted, creates a new quorum log file based on information in the local node's cluster database file. If the quorum log file is found and is not corrupted, this command has no effect.
/norepevtlogging (shortcut: n/a)
Allows no replication of event log entries.
/forcequorum (shortcut: /FO)
Restores quorum for a majority node set server cluster that has lost quorum.
Note
- It is recommended that you use the method described in "To force quorum in a majority node set server cluster" instead of the clussvc /debug /forcequorum command option described here.
NLB COMMANDS
Value and description:
help
Displays the online Help.
suspend [{Cluster[:Host] | all {local | global}}]
Suspends all cluster operations until the resume command is issued. Suspend temporarily stops cluster operations if they were previously started. The purpose of this command is to override any remote control commands that might be issued. All subsequent cluster-control commands except resume and query are ignored. The optional parameters allow the command to address a specific cluster, a specific cluster on a specific host, all clusters on the local computer, or all global computers that are part of the cluster.
resume [{Cluster[:Host] | all {local | global}}]
Resumes cluster operations after a previous suspend command. This does not restart cluster operations, but enables use of cluster-control commands, including remote control commands. The optional parameters allow the command to address a specific cluster, a specific cluster on a specific host, all clusters on the local computer, or all global computers that are part of the cluster.
start [{Cluster[:Host] | all {local | global}}]
Starts cluster operations on the specified hosts. This enables all ports that might have been previously disabled. The optional parameters allow the command to address a specific cluster, a specific cluster on a specific host, all clusters on the local computer, or all global computers that are part of the cluster.
stop [{Cluster[:Host] | all {local | global}}]
Stops cluster operations on the specified hosts. The optional parameters allow the command to address a specific cluster, a specific cluster on a specific host, all clusters on the local computer, or all global computers that are part of the cluster.
drainstop [{Cluster[:Host] | all {local | global}}]
Disables all new traffic handling on the specified hosts. While draining, hosts continue to service opened connections and stop their cluster operations when there are no more active connections. Draining mode can be terminated by explicitly stopping cluster mode with the stop command or by restarting new traffic handling with the start command. The optional parameters allow the command to address a specific cluster, a specific cluster on a specific host, all clusters on the local computer, or all global computers that are part of the cluster.
enable {vip[{:Port | :all}] | all[{:Port | :all}]} {Cluster[:{Host]| all {local | global}}}
Enables traffic handling for the rule whose port range contains the specified port. The first set of optional parameters allow the command to address every virtual IP address (vip), or specific vips on a specific port rule or on all ports. The second set of optional parameters allow the command to address a specific cluster, a specific cluster on a specific host, all clusters on the local computer, or all global computers that are part of the cluster. All ports specified by the port rule are affected. If all is specified for the port, this command is applied to the ports covered by all port rules. This command has no effect if the specified hosts have not started cluster operations.
disable {vip[{:Port | :all}] | all[{:Port | :all}]} {Cluster[:{Host]| all {local | global}}}
Disables and immediately blocks all traffic handling for the rule whose port range contains the specified port. The first set of optional parameters allow the command to address every virtual IP address (vip), or specific vips on a specific port rule or on all ports. The second set of optional parameters allow the command to address a specific cluster, a specific cluster on a specific host, all clusters on the local computer, or all global computers that are part of the cluster. All ports specified by the port rule are affected. If all is specified for the port, this command is applied to the ports covered by all port rules. All active connections on the specified hosts are blocked. To maintain active connections, use the drain function instead. This has no effect if the specified hosts have not started cluster operations.
drain {vip[{:Port | :all}] | all[{:Port | :all}]} {Cluster[:{Host]| all {local | global}}}
Disables new traffic handling for the rule whose port range contains the specified port. The first set of optional parameters allow the command to address every virtual IP address (vip), or specific vips on a specific port rule or on all ports. The second set of optional parameters allow the command to address a specific cluster, a specific cluster on a specific host, all clusters on the local computer, or all global computers that are part of the cluster. All ports specified by the port rule are affected. If all is specified for the port, this command is applied to the ports covered by all port rules. New connections to the specified hosts are not allowed, but all active connections are maintained. To disable active connections, use the disable command instead. This command has no effect if the specified hosts have not started cluster operations.
query [{Cluster[:Host]| all {local | global}}]
Displays the current cluster state and the list of host priorities for the current members of the cluster. The optional parameters allow the command to address a specific cluster, a specific cluster on a specific host, all clusters on the local computer, or all global computers that are part of the cluster. The possible states are:
- Unknown. The responding host has not started cluster operations and cannot determine the cluster's state.
- Converging. The cluster is currently attempting to converge to a consistent state. Prolonged convergence usually indicates a problem with cluster parameters. If this occurs, check the event logs on the cluster hosts for Network Load Balancing messages warning you about the source of the problem.
- Draining. The cluster has converged, and the responding host is draining active connections in response to a drainstop command.
- Converged as default. The cluster has converged, and the responding host is the current default (the highest priority host without a drainstop command in progress). The default host handles network traffic for all of the TCP/UDP ports not covered by the port rules.
- Converged. The cluster has converged, and the responding host is not the default host.
queryport [{vip:]Port [Cluster[:Host] | all [{local | global}]}]
Displays information about a given port rule. The first parameter specifies which port rule to query. Specify the port rule by using a port number that is within the range of the port rule that you want to query. If necessary, you can also specify a virtual IP address (VIP). The default is all VIPs. However, if a particular port rule is assigned to only a specific VIP (as opposed to all VIPs) you must specify the appropriate VIP in order for the port rule to be found by this command. The second set of optional parameters allow the command to address a specific cluster, a specific cluster on a specific host, all clusters on the local computer, or all global computers that are part of the cluster.
The information returned includes:
- Whether the port rule was found
- The state of the port rule (Enabled, Disabled or Draining)
- A count of packets accepted and dropped on that port rule. These counters are reset each time the cluster reconverges. For example, if you add a host to the cluster, you should see the counters reset on all hosts in the cluster. These counters can be used as a very coarse method of calculating load balance. For example, if a particular host has accepted 5000 packets and has dropped about 10000 packets, then that host is handling approximately 33% of the load for this port rule. Be aware that these numbers are dependent on a variety of factors and should only be used as a very rough estimate of actual load weight.
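The load estimate in the last item is the ratio of accepted packets to all packets counted for the rule on that host. The sketch below reproduces the arithmetic from the example. The function name is hypothetical, and it assumes the accepted and dropped counters have already been read from the queryport output by hand.

```python
def load_share(accepted, dropped):
    """Fraction of a port rule's packets that this host accepted."""
    total = accepted + dropped
    if total == 0:
        raise ValueError("no packets counted since the last convergence")
    return accepted / total


if __name__ == "__main__":
    share = load_share(5000, 10000)
    print(f"{share:.0%}")  # prints 33%
```

As the notes caution, the counters reset at every convergence and depend on many factors, so the result is only a coarse indicator.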
reload [{Cluster | all}] (local only)
Reloads the Network Load Balancing driver's current parameters from the registry. Cluster operations on the local host are automatically stopped and restarted if necessary. If an error exists in the parameters, the host will not join the cluster, and a warning is displayed. If this should occur, open the Network Load Balancing Properties dialog box to fix the problem.
display [{Cluster | all}] (local only)
Displays extensive information about your current Network Load Balancing parameters, cluster state, and past cluster activity. The last several event log records produced by Network Load Balancing are shown, including the binary data attached to those records. This command is designed to assist in technical support and debugging. The registry information retrieved by the display command shows what the next state of Network Load Balancing would be if a reload or some other operation that causes the driver to read the registry were to be performed. The registry information might or might not be the current state of Network Load Balancing.
params [{Cluster | all}] (local only)
Displays information about your current Network Load Balancing configuration. This command is similar to the display command; however, instead of retrieving the information from the registry, the params command queries the kernel-mode driver directly. The information displayed is therefore the current state of Network Load Balancing. (The registry information retrieved by the display command shows what the next state of Network Load Balancing would be if a reload or some other operation that causes the driver to read the registry were to be performed. The registry information might or might not be the current state of Network Load Balancing.) In addition to the configuration information, nlb params displays state variables from the kernel, including the current number of connections being maintained by Network Load Balancing and the number of dynamic allocations that have been necessary for connection tracking.
ip2mac Cluster
Displays the media access control address corresponding to the specified cluster name or IP address. If multicast support is enabled, the multicast media access control address is used by Network Load Balancing for cluster operations. Otherwise, the unicast media access control address is used. This command is useful for creating a static ARP entry in the router if necessary.
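The target syntax shared by suspend, resume, start, stop, drainstop, and query, `[{Cluster[:Host] | all {local | global}}]`, can be assembled programmatically before being passed to a process launcher. The sketch below only builds the argument list and does not run anything. The helper name `nlb_command`, its parameter names, and the sample cluster and host values are assumptions for illustration; the `nlb` program name is taken from the "nlb params" mention above.

```python
TARGETED_COMMANDS = {"suspend", "resume", "start", "stop", "drainstop", "query"}


def nlb_command(command, cluster=None, host=None, all_scope=None):
    """Build an argument list following [{Cluster[:Host] | all {local | global}}]."""
    if command not in TARGETED_COMMANDS:
        raise ValueError(f"unsupported command: {command}")
    args = ["nlb", command]
    if all_scope is not None:
        # "all" cannot be combined with a specific cluster or host.
        if cluster is not None or host is not None:
            raise ValueError("use either a cluster target or all_scope, not both")
        if all_scope not in ("local", "global"):
            raise ValueError("all_scope must be 'local' or 'global'")
        args += ["all", all_scope]
    elif cluster is not None:
        args.append(f"{cluster}:{host}" if host is not None else cluster)
    elif host is not None:
        raise ValueError("a host can only be given together with a cluster")
    return args


if __name__ == "__main__":
    # Hypothetical cluster address and host identifier.
    print(nlb_command("drainstop", cluster="10.0.0.1", host="2"))
    print(nlb_command("query", all_scope="local"))
```

Building the list separately from executing it makes it easy to log or review a drainstop before it reaches a production cluster.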