By default, make install installs all files in /usr/local/bin, /usr/local/lib, /usr/local/sbin, /usr/local/include, and /usr/local/man. You can specify an installation prefix other than /usr/local by passing --prefix as an argument to ./configure, for example:
./configure --prefix=$HOME
Verify you have environment variables configured so your system can find the shared libraries and binary files for TORQUE.
To set the library path, add the directory where the TORQUE libraries will be installed. For example, if your TORQUE libraries are installed in /opt/torque/lib, execute the following:
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/torque/lib
> ldconfig
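If the library path is set from a profile script that may be sourced more than once, the same directory can end up listed several times. The following is a minimal sketch in POSIX shell that appends a directory only when it is missing. The helper name append_lib_path is hypothetical and not part of TORQUE, and /opt/torque/lib is the example location used above.

```shell
#!/bin/sh
# Hypothetical helper (not part of TORQUE): append a directory to
# LD_LIBRARY_PATH unless it is already listed there.
append_lib_path() {
    dir=$1
    case ":${LD_LIBRARY_PATH}:" in
        *":${dir}:"*) ;;                                  # already present
        "::") LD_LIBRARY_PATH=$dir ;;                     # variable was empty
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${dir}" ;; # append at the end
    esac
    export LD_LIBRARY_PATH
}

# Example location from the text; adjust to the actual install prefix.
append_lib_path /opt/torque/lib
```

Calling the helper a second time with the same directory leaves the variable unchanged.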
Cluster Resources recommends that the TORQUE administrator be root.
$TORQUEHOME/server_priv/ contains configuration and other information needed by pbs_server. One of the files in this directory is serverdb, which holds the configuration parameters for pbs_server and its queues. For pbs_server to run, serverdb must be initialized.
serverdb can be initialized in two ways: with pbs_server -t create, or with the torque.setup script.
Restart pbs_server after initializing serverdb.
> qterm
> pbs_server
The '-t create' option tells pbs_server to create the serverdb file and initialize it with a minimum configuration to run pbs_server. To see the configuration, use qmgr:
> pbs_server -t create
> qmgr -c 'p s'
#
# Set server attributes.
#
set server acl_hosts = kmn
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
A single queue named 'batch' and a few needed server attributes are created.
The torque.setup script uses pbs_server -t create to initialize serverdb, and then adds a user as a manager and operator of TORQUE and sets other commonly used attributes. The syntax is:
> ./torque.setup ken
> qmgr -c 'p s'
#
# Create queues and set their attributes.
#
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch resources_default.nodes = 1
set queue batch resources_default.walltime = 01:00:00
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_hosts = kmn
set server managers = ken@kmn
set server operators = ken@kmn
set server default_queue = batch
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server mom_job_sync = True
set server keep_completed = 300
The environment variable $TORQUEHOME is where configuration files are stored. For TORQUE 2.1 and later, $TORQUEHOME is /var/spool/torque/. For earlier versions, $TORQUEHOME is /usr/spool/PBS/.
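Scripts that touch the configuration files need to know which of these directories applies. The sketch below shows one way to resolve it, assuming $TORQUEHOME may or may not be exported in the current shell. The helper name torque_home is hypothetical, and the fallback order is simply the two defaults named above.

```shell
#!/bin/sh
# Hypothetical helper: print the TORQUE home directory.
# Uses $TORQUEHOME when it is set, otherwise falls back to the
# default locations named in the text.
torque_home() {
    if [ -n "${TORQUEHOME:-}" ]; then
        printf '%s\n' "$TORQUEHOME"
    elif [ -d /var/spool/torque ]; then
        printf '%s\n' /var/spool/torque      # TORQUE 2.1 and later
    else
        printf '%s\n' /usr/spool/PBS         # earlier versions
    fi
}

echo "nodes file: $(torque_home)/server_priv/nodes"
```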
The pbs_server needs to know which systems on the network are its compute nodes. Each node must be specified on a line in the server's nodes file. This file is located at $TORQUEHOME/server_priv/nodes. In most cases, it is sufficient to specify just the names of the nodes on individual lines; however, various properties can be applied to each node.
Syntax of nodes file:
node-name[:ts] [np=] [gpus=] [properties]
The [:ts] option marks the node as timeshared. Timeshared nodes are listed by the server in the node status report, but the server does not allocate jobs to them.
The [np=] option specifies the number of virtual processors for a given node. The value can be less than, equal to, or greater than the number of physical processors on any given node.
The [gpus=] option specifies the number of GPUs for a given node. The value can be less than, equal to, or greater than the number of physical GPUs on any given node.
The node processor count can be automatically detected by the TORQUE server if auto_node_np is set to TRUE. This can be set using the command qmgr -c "set server auto_node_np = True". Setting auto_node_np to TRUE overwrites the value of np set in $TORQUEHOME/server_priv/nodes.
The [properties] option allows you to specify arbitrary strings to identify the node. Property strings are alphanumeric characters only and must begin with an alphabetic character.
Comment lines are allowed in the nodes file if the first non-white space character is the pound sign (#).
The example below shows a possible node file listing.
$TORQUEHOME/server_priv/nodes:
# Nodes 001 and 003-005 are cluster nodes
#
node001 np=2 cluster01 rackNumber22
#
# node002 will be replaced soon
node002:ts waitingToBeReplaced
# node002 will be replaced soon
#
node003 np=4 cluster01 rackNumber24
node004 cluster01 rackNumber25
node005 np=2 cluster01 rackNumber26 RAM16GB
node006
node007 np=2
node008:ts np=4
...
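The syntax rules above (comment lines, the :ts suffix, np=, gpus=, and alphanumeric properties that start with a letter) can be checked before restarting the server. The following is a rough sketch of such a check, not a TORQUE tool. The function name check_nodes_file is hypothetical, and reporting np as 1 when np= is absent is an assumption of this sketch.

```shell
#!/bin/sh
# Hypothetical checker for a nodes file (not shipped with TORQUE).
# Prints "name ts=yes|no np=N" per node and flags invalid properties.
# Assumption: np is reported as 1 when no np= option is given.
check_nodes_file() {
    awk '
        /^[ \t]*#/ || /^[ \t]*$/ { next }        # skip comments and blank lines
        {
            name = $1; ts = "no"; np = 1
            if (sub(/:ts$/, "", name)) ts = "yes" # strip the timeshared marker
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^np=[0-9]+$/) {
                    np = substr($i, 4) + 0
                } else if ($i ~ /^gpus=[0-9]+$/) {
                    continue                      # accepted, not reported
                } else if ($i !~ /^[A-Za-z][A-Za-z0-9]*$/) {
                    printf "line %d: invalid property \"%s\"\n", NR, $i > "/dev/stderr"
                    bad = 1
                }
            }
            printf "%s ts=%s np=%d\n", name, ts, np
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

A call such as check_nodes_file "$TORQUEHOME/server_priv/nodes" returns a non-zero status if any property breaks the naming rule, so it can gate a restart in a script.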
If you are using TORQUE self-extracting packages with the default compute node configuration, no additional steps are required and you can skip this section.
If you are installing manually, or if advanced compute node configuration is needed, edit the $TORQUEHOME/mom_priv/config file on each node. The recommended settings are below.
$TORQUEHOME/mom_priv/config:
$pbsserver headnode # note: hostname running pbs_server
$logevent 255 # bitmap of which events to log
This file is identical for all compute nodes and can be created on the head node and distributed in parallel to all systems.
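Because the file is the same everywhere, it can be generated once from a script. The sketch below writes the two recommended settings to a file of your choice. The function name write_mom_config is hypothetical, and headnode is the placeholder hostname from the example above.

```shell
#!/bin/sh
# Hypothetical helper: write the minimal MOM configuration shown above.
#   $1 = hostname running pbs_server
#   $2 = destination file
write_mom_config() {
    server=$1
    dest=$2
    # The dollar signs are escaped so they land in the file literally.
    cat > "$dest" <<EOF
\$pbsserver $server # note: hostname running pbs_server
\$logevent 255 # bitmap of which events to log
EOF
}

# Demonstration: write to a temporary file and show the result.
tmpfile=$(mktemp)
write_mom_config headnode "$tmpfile"
cat "$tmpfile"
rm -f "$tmpfile"
```

The generated file could then be copied to $TORQUEHOME/mom_priv/config on each node with whatever tool the site already uses for parallel distribution; the text does not prescribe one.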
After serverdb and the server_priv/nodes file are configured, and MOM has a minimal configuration, restart the pbs_server on the server node and the pbs_mom on the compute nodes.
Compute Nodes:
> pbs_mom
Server Node:
> qterm -t quick
> pbs_server
After waiting several seconds, the pbsnodes -a command should list all nodes in state free.
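The "wait, then check" step can be scripted. The sketch below polls pbsnodes -a and reports any node whose state is not free. It assumes the usual pbsnodes -a layout, in which each node name sits on an unindented line followed by indented "state = ..." attribute lines. The function names, the retry count, and the delay are arbitrary choices for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `pbsnodes -a` output on stdin and print the
# names of nodes whose state is not exactly "free".
# Assumes node names are unindented and attributes are indented.
nodes_not_free() {
    awk '
        /^[^ \t]/ { node = $1 }    # unindented line starts a new node entry
        $1 == "state" && $2 == "=" && $3 != "free" { print node }
    '
}

# Poll up to 10 times, 3 seconds apart (both values are arbitrary).
wait_for_free_nodes() {
    tries=0
    busy=""
    while [ "$tries" -lt 10 ]; do
        if out=$(pbsnodes -a 2>/dev/null); then
            busy=$(printf '%s\n' "$out" | nodes_not_free)
            [ -z "$busy" ] && return 0
        else
            busy="(pbsnodes failed; is pbs_server running?)"
        fi
        tries=$((tries + 1))
        sleep 3
    done
    echo "nodes not free: $busy" >&2
    return 1
}
```

Running wait_for_free_nodes after the restarts returns 0 once every listed node reports free, and otherwise prints the remaining nodes to standard error.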
Reposted from: http://fpuli.baihongyu.com/