The queue software PBS is used to manage access to the geo-cluster compute nodes. Reference material regarding PBS can be found on the web, and online help is also available as described below.

The PBS configuration on the MITgcm facility allows a job script to request a set of nodes for the exclusive use of that job. If the nodes (or other resources) requested by the job script are unavailable, the script waits in a queue and starts executing only once the resources become available.

The PBS software resides in the directory /usr/pbs. Adding the directory /usr/pbs/bin to your PATH command search environment variable and the directory /usr/pbs/man to your MANPATH environment variable gives you access to the PBS commands and to online help for them through the man command.

A simple example PBS job script is shown below:

# Example PBS script to run a job on the geo-cluster.
# The lines beginning #PBS set various queuing parameters.
# o -N Job Name
#PBS -N examplejob
# o -l resource lists that control where job goes. Here we ask for 3 nodes.
#PBS -l nodes=3
# o Where to write output
#PBS -e stderr
#PBS -o stdout
# o Export all my environment variables to the job
#PBS -V
# Print the list of nodes allocated to this job
cat $PBS_NODEFILE
echo 'The list above shows the nodes this job has exclusive access to.'
echo 'The list can be found in the file named in the variable $PBS_NODEFILE'

Type this job script into a file and then submit the file using the command:

qsub filename

where :-
    filename - is the name of the file into which the job script was typed (the command qsub is in the directory /usr/pbs/bin)
The following output should be written in a file called stdout:

Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
geo-001
geo-002
geo-003
The list above shows the nodes this job has exclusive access to.
The list can be found in the file named in the variable $PBS_NODEFILE

This output shows that the job script was allocated the nodes geo-001, geo-002 and geo-003. The list of nodes allocated to the job script is accessed through the file named in the environment variable $PBS_NODEFILE, which for the test job above happened to be /usr/spool/PBS/aux/1272.meinesz. This path name is different for every PBS job, so scripts should always use the $PBS_NODEFILE variable rather than a fixed path.

The example script does not execute any programs. A real script would either:

  • use rsh to log in to the nodes given in $PBS_NODEFILE and execute programs on those nodes.
  • use MPI to start a job that runs in parallel across the set of nodes listed in $PBS_NODEFILE.
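
The first option above can be sketched as a loop over the node file. This is a minimal illustration only, assuming a Bourne-style shell, one node name per line in the node file, and rsh access to the nodes without a password prompt. The names run_on_nodes and REMOTE_SHELL are hypothetical helpers introduced for this example and are not part of PBS.

```shell
# Sketch: run one command on every node listed in a PBS node file.
# REMOTE_SHELL and run_on_nodes are hypothetical names (not part of PBS).
REMOTE_SHELL=${REMOTE_SHELL:-rsh}

run_on_nodes() {
    # $1 = file listing one node name per line; remaining args = command
    nodefile=$1
    shift
    # sort -u guards against a node being listed more than once
    for node in $(sort -u "$nodefile"); do
        $REMOTE_SHELL "$node" "$@"
    done
}

# Inside a PBS job the node file is named by $PBS_NODEFILE;
# outside a job the variable is unset and nothing is run.
if [ -n "${PBS_NODEFILE:-}" ]; then
    run_on_nodes "$PBS_NODEFILE" hostname
fi

# For the MPI option, many mpirun launchers accept the node file directly,
# for example (assumption: your MPI provides the -machinefile option):
#   mpirun -np 3 -machinefile "$PBS_NODEFILE" ./my_program
```

Setting REMOTE_SHELL=echo before calling run_on_nodes prints the commands instead of running them, which is a convenient way to check a script before submitting it.
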
The full list of PBS commands can be found in the directory /usr/pbs/bin, with online manual information in /usr/pbs/man.
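
To make these two directories available in a login session, the search paths can be extended as in the sketch below. It assumes a Bourne-style shell (sh, bash, ksh); csh and tcsh users would use setenv instead. PBS_ROOT is a hypothetical helper variable introduced for this example, not something PBS itself defines.

```shell
# Sketch for Bourne-style shells; PBS_ROOT is a hypothetical helper
# variable, not one defined by PBS.
PBS_ROOT=/usr/pbs

# Append the PBS command directory to the command search path.
PATH="$PATH:$PBS_ROOT/bin"

# Append the PBS manual pages. ${MANPATH:+...} avoids a stray leading
# colon when MANPATH was previously unset.
MANPATH="${MANPATH:+$MANPATH:}$PBS_ROOT/man"

export PATH MANPATH
```

With these settings in place, commands such as qsub can be run without their full path, and man qsub displays the corresponding manual page. Note that on some systems setting a previously unset MANPATH replaces the default manual search path, so the system manual directories may need to be listed as well.
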