Zen and the art of installing the loadcpulimit patch.
=====================================================

(For those not installing newer kernels, which already contain the
loadcpulimit patch.)

Obtain a "pure vanilla" set of sources and install them. These may be
found at kernel.org.

Go to the openMosix kernel sources base directory, i.e.

    cd /usr/src/

Copy the patch for loadcpulimit into this directory and unpack it:

    cp loadcpulimit_v0.1.patch.tgz /usr/src/
    tar -zxf loadcpulimit_v0.1.patch.tgz

Apply the patch:

    patch -p1 < loadcpulimit_v0.1.patch

Now configure your kernel (e.g. run make menuconfig or make xconfig) and
check the 'Load Limit' option under the openMosix configuration options.

Build your kernel and install it. (See /usr/source//README for
instructions on this.)

LOADCPULIMIT
============

This new functionality gives openMosix users the ability to restrict the
load imposed by the processes running on an openMosix cluster. Through
the parameters relating to LOADLIMIT, it is possible to prevent a node
from being overwhelmed by the jobs running on it. It is now also possible
to specify how much CPU time a particular node will allow openMosix jobs
to consume, so as to control exactly what load that node will accept and
to forcibly migrate away to other nodes any jobs that impose too great a
load.

New entries have been added under /proc/hpc/admin and /proc/hpc/nodes/

LOADLIMIT:
----------
Writing a value to /proc/hpc/admin/loadlimit enables the load limiter and
sets the maximum load. This must be used in conjunction with the
LLIMITMODE setting that follows, which determines the type of processes
the limit applies to.

EXAMPLE:
    echo 800 > /proc/hpc/admin/loadlimit

    Enables the load limiter and sets the maximum load to 800.

LLIMITMODE:
-----------
Determines the type of processes to which loadlimit applies. By default
it restricts the TOTAL LOAD on this node, which includes both local
processes and those that were migrated to this node.

DEFAULT SETTING: 0 (Disabled, i.e. no distinction is made between local
    and remote processes.)

POSSIBLE SETTINGS: 0, 1, 2

    0: Total load may not exceed that specified by loadlimit (see
       LOADLIMIT above and the ERRORS? section below).
       Any local or remote processes that impose a load greater than
       loadlimit will be expelled or migrated away from this node as
       appropriate.

    1: Total load imposed by processes that have migrated to this node
       (remote processes) may not exceed the load specified by
       loadlimit. Processes that increase the load beyond 'loadlimit'
       will be expelled or migrated away from this node.

    2: Total load imposed by processes that are local to this node (not
       originating from remote nodes) may not exceed 'loadlimit'.
       Processes that cause the load on this node to exceed 'loadlimit'
       will be migrated or expelled.

EXAMPLE:
    echo 1 > /proc/hpc/admin/llimitmode

    Ensures that 'loadlimit' is applied only to remote processes and not
    to local ones.

ERRORS?
    Is the load limiter going to migrate critical local processes under
    certain high-load circumstances with llimitmode=2 or 0, such that
    those critical local processes might end up not running on their
    home node? I can imagine, for example, that migrating away a locally
    running CRON daemon, database program, or disk driver would be
    extremely bad.

CPULIMIT:
---------
Allows the user to specify the amount of CPU time that may be used. In
other words, where the CPU(s) may normally run 100% of the time at most,
cpulimit allows the user to specify a lower maximum for the CPU load.
This is not particularly useful in and of itself (it could effectively
be used to turn a high-end quad-Xeon box into something akin to a
486-25SX), but in conjunction with the cpulimitmode setting below it can
be quite handy for limiting the amount of CPU time that various types of
processes are permitted. In short, where loadlimit governs the load
imposed on the system, cpulimit governs the amount of actual CPU time
used by the specified processes.

DEFAULT SETTING: 0 (Disabled.)

POSSIBLE SETTINGS: 0 to N00, where N is the number of CPUs in this node.
    The CPU load is measured in percent, with the full usage of one CPU
    being equal to 100 percent.
    Thus, for example, allowable values on a quad-Xeon system would be
    0 (Disabled) to 400 (full power on all engines!).

EXAMPLE:
    echo 160 > /proc/hpc/admin/cpulimit

    Sets a limit of 160% in total. That is 80 percent per CPU on a
    dual-processor node, or 40 percent per CPU on a quad-processor
    system.

CPULIMITMODE:
-------------
This works much like llimitmode: it specifies the types of processes to
which the cpulimit setting, above, applies.

DEFAULT SETTING: 0

POSSIBLE SETTINGS: 0, 1, 2

    0: Total CPU usage on the system (both remote and local processes)
       must remain below 'cpulimit'. Processes that would increase the
       CPU usage to a level higher than 'cpulimit' will be expelled or
       migrated away until the CPU usage drops to a lower level and they
       can be accommodated.

    1: Total CPU usage on this system by REMOTE PROCESSES only may not
       rise above the level specified by 'cpulimit'. Remote processes
       that would cause 'cpulimit' to be exceeded will be migrated away
       or expelled until the CPU usage drops to more acceptable levels.

    2: Total CPU usage of all local processes may not exceed the value
       specified by 'cpulimit'. Any local processes whose CPU usage
       results in an increase beyond 'cpulimit' will be expelled or
       migrated away until the CPU usage drops to more acceptable
       levels.

EXAMPLE:
    echo 2 > /proc/hpc/admin/cpulimitmode

    Forces the setting entered in 'cpulimit' to be applied only to local
    processes on this node.

/PROC/HPC/NODES//
=============================

LOADLOCAL:
----------
This parameter retrieves the load currently imposed by local processes
on the node specified.

EXAMPLE:
    cat /proc/hpc/nodes/485/loadlocal
    [response] 167

    Local processes are imposing a load of 167 on node 485.
    (NOTE: I do not know what units these are, so I cannot say whether
    167 is a light load or a heavy one.)

LOADREMOTE:
-----------
This parameter retrieves the load currently imposed by remote processes
on the node specified.

EXAMPLE:
    cat /proc/hpc/nodes/485/loadremote
    [response] 711

    Remote processes are imposing a load of 711 on node 485.
    (NOTE: I do not know what units these are, so I cannot say whether
    711 is a light load or a heavy one.)

LLIMITMODE:
-----------
This parameter retrieves the current setting of llimitmode on the
specified node. See above for a discussion of llimitmode and how to set
it.

EXAMPLE:
    cat /proc/hpc/nodes/485/llimitmode
    [response] 0

    llimitmode is disabled (the settings for loadlimit are applied to
    both local and remote processes).

LOADLIMIT:
----------
This parameter retrieves the current setting of loadlimit on the
specified node. See above for a discussion of loadlimit and how to set
it.

EXAMPLE:
    cat /proc/hpc/nodes/484/loadlimit
    [response] 800

    loadlimit is set to a maximum of 800. This, in combination with the
    setting for llimitmode, determines the maximum load that will be
    permitted on this node for local processes, remote processes, or
    both.

CPULOCAL:
---------
This parameter retrieves the current CPU utilization imposed upon the
node by local processes. This is reported as 0 to N00, where N is the
number of CPUs on the node.

EXAMPLE:
    cat /proc/hpc/nodes/485/cpulocal
    [response] 80

    The system is spending 80% of one CPU's 'horsepower' running local
    processes.

CPUREMOTE:
----------
This parameter retrieves the current CPU utilization imposed upon the
node by remote processes. This is reported as 0 to N00, where N is the
number of CPUs on the node.

EXAMPLE:
    cat /proc/hpc/nodes/493/cpuremote
    [response] 199

    The system is spending 199% of its CPU 'horsepower' running remote
    processes. Since one fully used CPU counts as 100%, the system is
    clearly at least a dual-processor machine.

CPULIMITMODE:
-------------
This parameter retrieves the current setting of cpulimitmode for this
node. See above for a discussion of cpulimitmode and how to set it.

EXAMPLE:
    cat /proc/hpc/nodes/485/cpulimitmode
    [response] 0

    cpulimitmode is disabled (any setting of cpulimit applies to both
    local and remote processes).

CPULIMIT:
---------
This parameter retrieves the current setting of cpulimit for this node.
See above for a discussion of cpulimit and how to set it.

EXAMPLE:
    cat /proc/hpc/nodes/492/cpulimit
    [response] 80

    cpulimit is set so that a CPU usage of no more than 80% will be
    permitted on the node for the types of processes selected by
    cpulimitmode.
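
PUTTING IT TOGETHER (SKETCH):
-----------------------------
The settings and readouts above can be wrapped in a few small shell
functions. This is only a sketch, not something shipped with the patch:
the function names (mode_name, per_cpu_share, set_limits, show_node) and
the HPC_ROOT variable are made up for illustration. HPC_ROOT defaults to
/proc/hpc and exists only so the functions can be tried against a scratch
directory. The file names are the ones documented above, and writing to
the admin entries is assumed to require root.

```shell
#!/bin/sh
# Sketch only: helper functions wrapping the /proc/hpc entries described
# in this document. HPC_ROOT is a hypothetical override (not part of the
# patch) so the functions can be tried against a scratch directory.
HPC_ROOT="${HPC_ROOT:-/proc/hpc}"

# mode_name MODE
# Describe a llimitmode or cpulimitmode value in words.
mode_name() {
    case "$1" in
        0) echo "local and remote processes" ;;
        1) echo "remote processes only" ;;
        2) echo "local processes only" ;;
        *) echo "unknown mode"; return 1 ;;
    esac
}

# per_cpu_share CPULIMIT NCPUS
# Integer percentage of each CPU that a given cpulimit allows.
per_cpu_share() {
    echo $(( $1 / $2 ))
}

# set_limits LOADLIMIT LLIMITMODE CPULIMIT CPULIMITMODE
# Write all four admin settings on the local node, stopping at the
# first write that fails.
set_limits() {
    echo "$1" > "$HPC_ROOT/admin/loadlimit"  &&
    echo "$2" > "$HPC_ROOT/admin/llimitmode" &&
    echo "$3" > "$HPC_ROOT/admin/cpulimit"   &&
    echo "$4" > "$HPC_ROOT/admin/cpulimitmode"
}

# show_node NODE
# Print every per-node entry that is readable for the given node number.
show_node() {
    for entry in loadlocal loadremote llimitmode loadlimit \
                 cpulocal cpuremote cpulimitmode cpulimit; do
        file="$HPC_ROOT/nodes/$1/$entry"
        if [ -r "$file" ]; then
            printf '%s=%s\n' "$entry" "$(cat "$file")"
        fi
    done
}
```

For instance, after sourcing these functions on a node:

    set_limits 800 1 160 1
    show_node 485
    per_cpu_share 160 2

The first command restricts remote processes to a load of 800 and a CPU
usage of 160%, the second prints whatever entries exist for node 485 (the
node number used in the examples above), and the third works out the
per-CPU share of a 160% limit on a dual-processor node.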