1. Example scripts for using PBS
- Pay attention to the comments (prepended by # alone)
- Notice the PBS directives (prepended by #PBS)
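As a minimal illustration of the difference (the queue name "four" is simply the one used in the example below), PBS only interprets lines that begin exactly with "#PBS "; every other line beginning with "#" is an ordinary shell comment:

#!/bin/csh
# An ordinary comment: ignored by PBS and by the shell
#PBS -q four
# The line above is a PBS directive (here: request the "four" queue);
# to the shell it is still just a comment
echo "Hello from PBS"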
1.1. Parallel jobs
A PBS parallel job (using the Gigabit Ethernet based MPI in this particular case).
1.1.1. Script file example.csh
#!/bin/csh
# invoking mpirun on ITRDA Linux cluster
#
# All PBS options start as "#PBS " and can be specified on the command line
# after qsub instead of being embedded in the script file.
#----------------------------------------------
# o Queue name
#   -q queue
#   Queues available on itrda are:
#   four (2hours,16nodes), four-twelve (12hours,26nodes), long (168hours,64nodes)
#PBS -q four
#----------------------------------------------
# o Job name instead of the PBS script filename
#   -N Job name (use a distinguishing name)
#PBS -N Halo
#----------------------------------------------
# o Resource lists
#   -l resource lists, separated by a ","
#   To ask for N nodes use "nodes=N"
#   To ask for 2 processors per node use ":ppn=2", otherwise ":ppn=1"
#   after the nodes=N. Preferably use ppn=2 and ask for fewer nodes.
#   To ask for Myrinet use ":myrinet", for Gigabit Ethernet use ":gigabit"
#   after the nodes=N:ppn=M
#   To specify total wallclock time use "walltime=hh:mm:ss"
#PBS -l nodes=16:ppn=1,walltime=00:10:00
#----------------------------------------------
# o stderr/out combination
#   -j {eo|oe}
#   Causes the standard error and standard output to be combined in one file.
#   For standard output to be added to standard error use "eo"
#   For standard error to be added to standard output use "oe"
#
# o stderr/out (specify them instead of getting script.[oe]$PBS_JOBID)
#   -e standard error file
#   -o standard output file
#   You can append ${PBS_JOBID} to ensure distinct filenames
#PBS -e surf.gigabit.stderr
#PBS -o surf.gigabit.stdout
#----------------------------------------------
# o Starting time
#   -a time
#   Declares the time after which the job is eligible for execution.
#----------------------------------------------
# o User notification
#   -m {a|b|e}
#   Send mail to the user when:
#   job aborts: "a", job begins running: "b", job ends: "e"
#PBS -m ae
#----------------------------------------------
# o Exporting of environment
#   -V export all my environment variables
#PBS -V
#----------------------------------------------
# Begin execution
#
# Check the environment variables
#
#printenv
#
# get PBS node info
#
echo $PBS_NODEFILE
cat $PBS_NODEFILE
#----------------------------------------------
# cd to the working directory from which the job was submitted
#
cd $PBS_O_WORKDIR
#
# Run the MPI code
#
# Using Myrinet and the Intel compilers (not working yet)
#/usr/local/pkg/mpich-gm/mpich-1.2.6..13-gm/ifc/bin/mpirun -v -machinefile $PBS_NODEFILE -np 16 ./Halo.ifc8-myri
# Using Myrinet and the GNU compilers
#/usr/local/pkg/mpich-gm/mpich-1.2.6..13-gm/g77/bin/mpirun -machinefile $PBS_NODEFILE -np 16 ./Halo.g77-myri
# Using Gigabit Ethernet and the Intel compilers
#/usr/local/pkg/mpich/mpich-1.2.6/ifc/bin/mpirun -machinefile $PBS_NODEFILE -np 16 ./Halo.ifc8-gige
# Using Gigabit Ethernet and the GNU compilers
/usr/local/pkg/mpich/mpich-1.2.6/g77/bin/mpirun -machinefile $PBS_NODEFILE -np 16 ./Halo.g77-gige
#
# Exit (not strictly necessary)
#
exit
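As the header comment above notes, the same options can be given on the qsub command line instead of being embedded in the script; command-line options normally take precedence over the embedded directives. A sketch using the same queue, job name and resource list as the script:

qsub -q four -N Halo -l nodes=16:ppn=1,walltime=00:10:00 example.csh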
1.1.2. Submission
-bash-2.05b$ qsub example.csh
160.itrda
-bash-2.05b$ qstat
Job id           Name             User             Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
81.itrda         STDIN            cnh                     0 T four
160.itrda        Halo             ce107                   0 Q four
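The job was queued as 160.itrda in state Q; once the scheduler starts it the state changes to R, and on completion the files named by -o and -e appear in the submission directory. To keep an eye on just this job, or to remove it if it was submitted by mistake, the usual PBS client commands can be used (the job id below is the one returned by qsub above):

-bash-2.05b$ qstat 160.itrda
-bash-2.05b$ qdel 160.itrda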
1.1.3. Current Issues
- Resource gigabit (unlike resource myrinet) is not accepted by qsub.
- The qualifier ppn=1 appears not to be observed by the queueing system.
- PBS_JOBID cannot be used in the -o and -e options.
- The Myrinet/Intel compiler combination does not currently work.
- Take care to separate the options passed to #PBS -l with "," rather than ":"; the exceptions are "ppn=" and "myrinet" or "gigabit", which are appended to the nodes specification with ":" (see the example after this list).
- Specifying a good estimate of the wallclock time (to be safe, say 1.1 times what you expect the completion time to be) is essential for the cluster to become an efficient resource for the community. Benchmark your code to obtain such estimates!
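A sketch of a resource request that respects these rules (the node count, interconnect and times are arbitrary, and assume a code benchmarked at roughly 27 minutes, padded by about 1.1x to 30 minutes): items in one -l list are separated by commas, while node properties such as ppn and the interconnect are appended to the nodes specification with colons.

# 8 nodes, 2 processors per node, Myrinet, at most 30 minutes of wallclock time
#PBS -l nodes=8:ppn=2:myrinet,walltime=00:30:00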