University of Birmingham

BEAR

Navigation Section

BlueBEAR Job Submission

Introduction

There are fundamentally two services that can be provided by a large cluster such as this:

  • Capacity Computing  which is a service provided to a large and diverse user base, none of whom place long term intensive demands on system resources
  • Capability Computing which is a service provided to a small number of users who place long term intensive demands on system resources

This service is optimised as a Capacity Computing resource.

This page has the following sections

Overview of the scheduling system

Jobs on the cluster are under the control of a scheduling system to optimise the throughput of the service. This means that there are some classes of work, for example work that requires repeated access to specified nodes, that are not appropriate for this service. IT Services may be able to offer alternative facilities for this type of work; contact the IT Service Desk in the first instance to discuss any such requirements. The scheduling system is configured to offer an equitable distribution of resources over time to all users. The key means by which this is achieved are:

  • jobs are scheduled according to the queue, if any, and the resources that are requested. See the Job Resources section of this help page for more details.

  • jobs are not necessarily run in the order in which they are submitted. The scheduling system runs a fair-share policy, whereby users associated with accounts (projects) that have not used the system for some time will be given a higher priority than accounts who have recently made heavy use of the system. This means that it is not possible for any one account to block the system just by submitting multiple jobs.

  • jobs requiring a large number of cores will have to queue until these become available. The system will run smaller jobs until these resources become available, and hence it is beneficial to specify a realistic wall time for a job - this is known as backfill.

  • when short jobs are held in the job queue alongside long jobs the priority of these short jobs increases relative to the long jobs over time. This, along with the backfill mechanism described above, means that short jobs cannot be blocked by a large number of long jobs waiting to run.

Interactive work requiring significant resources, especially Graphical User Interfaces for commercial applications, should not be carried out on the logon nodes but run on the worker nodes under the control of the scheduler by running interactive jobs.

There are limits on the resources available to individual jobs and to the overall resources in use at any time by a single user. See the Resource Limits section of this page for details

top


Basic job handling

The command to submit a job is qsub. This reads its input from a file or from its standard input, the latter usually using the echo command. The job is submitted to the scheduling system, using the requested resources, and will run on the first available node(s) that are able to provide the resources requested. For example, to submit a job to return the hostname of the node on which the job is run using the default set of resources, use the command

echo 'echo $HOSTNAME' | qsub

To submit the set of commands contained in the file myscript, use the command

qsub myscript

The batch system will return a job number, for example

35929.qmaster.cvos.cluster

When this job completes there will be two files in the home directory (the one that you are when logging on) named STDIN.e35929 and STDIN.o35929. STDIN.e35929 contains any error output and also the output from any module commands. STDIN.o35929 contains the job output, which for the first example is the name of the node that the job ran on, for example u2n127, possibly a system message relating to the processes used in the job and a list of the resources used in the job.

There is a one-second delay before the qsub command is executed. Experience has shown that scripts which submit many - hundreds or more -  jobs can cause problems if there is no delay between consecutive qsub commands.

To kill a queued or running job use the qdel command. For example, to kill the previous job:

qdel 35929

top


More complex job submission

A job will start in the home directory (the one that you are when logging on) by default, not the directory that the job was submitted from. For example, the job

echo 'ls' | qsub

will give a listing of the home directory, regardless of the directory that the job was submitted from. This might not be the desired action, for example if an input file is specified in the job. To ensure that the job runs from the submitted directory, add -d $PWD to the qsub command. To submit a job to list the files in the directory that the job was submitted from:

echo 'ls' | qsub -d $PWD

(-d is the option to specify the initial directory for a job and $PWD always refers to the current directory). Similarly,

qsub -d$PWD myscript

will run the script file myscript that is in the current directory.

The output and error files described in the previous section can be merged into a single output file by the -j oe option. The following command runs the script file myscript that is in the current directory and produces a single output file:

qsub -d$PWD -j oe myscript

There are many other options; use the command

man qsub

for a fuller description of these options.

top


Interactive jobs

Jobs can be run interactively as well as in the batch system; for example when running an interactive graphical environment for one of the commercial applications. These should not be run on the logon nodes but on one of the worker nodes. To gain access to a worker node use the command

qsub -I

where the I in -I is an upper-case letter i (short for Interactive). This submits a job to give interactive access to a worker node and so will return a job number similar to:

qsub: waiting for job 4377688.qmaster.cvos.cluster to start

The job will then queue like any other job, and when it runs it will give output similar to:

qsub: job 4377688.qmaster.cvos.cluster ready
+--------------------------------------------------------------------------+
| Job starting at 2011-07-19 14:41:35 for hattonps on the BlueBEAR Cluster
| Job identity jobid 4377688 jobname STDIN
| Job requests nodes=1:ppn=1,pmem=1996mb,walltime=02:00:00
| Job assigned to node u4n061
+--------------------------------------------------------------------------+
Please modify .bashrc to tailor your needs
[username@u4n061 ~]$

where the node that is being used is shown in the prompt.

This will transfer your session to a worker node, on which you can use any of the commands that are available on the logon nodes. If the -X option is also specified graphical traffic is returned to your desktop from this node as long as Exceed and putty have been configured as recommended. As an example, the following set of commands will transfer a session to a worker node and run the Abaqus CAE graphical environment:

qsub -X -I
(wait for worker node, as described above)
module load apps/abaqus
abaqus cae

Such jobs are subject to the usual default job resources, described with examples in a following section. In particular, a longer walltime may be specified if required.

To use all 4 cores interactively on a single node, for example when developing parallel codes:

qsub -I -l 'nodes=1:ppn=4'

where the first I is an upper-case i and the second l is a lower-case L.

top


Test jobs

Some nodes are dedicated to running short test jobs - see the Resource Limits section of this page for the limits on this queue. To use this queue, add

-q bbtest

to the qsub command; for example

echo 'ls' | qsub -q bbtest

Since this queue is intended for short test jobs it allows jobs from different users to run on a node, and for more than one job to be running on a core. This means that the elapsed (wall) time may vary between runs, since a job can be affected by other jobs on the same node, but the overall throughput is higher than if the same jobs were run in the default queue. Note that

-q bbtest

has to be specified, along with any other options, to run in this queue - jobs that do not specify the queue name but asked for resources that are low enough to run in this queue will be scheduled in the standard job queue; this means that small jobs can run in the standard queue and hence be isolated from other users' jobs.

top


Job scripts

Job scripts can contain batch system options as well as Unix commands. For example, consider a file myscript containing the following:

#!/bin/bash
#PBS -l walltime=4:0:0
#PBS -j oe
cd  "$PBS_O_WORKDIR"
module load apps/intel/2011.0.084
icc -o mybin mysource.c
./mybin

The first line runs the job in the GNU Bourne Again Shell (the same shell as is used on the logon node) and the next two lines here set job options, which could alternatively been supplied as options on the qsub command. The second line sets a resource limit (with the -l option, lowercase el) of 4 hours of wall (elapsed) time. The third line requests that command errors are merged with the standard output in a single file. The fourth line changes the current directory in a job from the home directory to the one current when the job was submitted. The next line makes the Intel C compiler available to the job which is then used to compile a C program and produce a binary file called mybin. The last line runs that binary file. This whole script is submitted with the command

qsub myscript

top


Monitoring a job

The qstat command gives the status of the jobs submitted by the logged-on user.

The output from qstat consists of one row for each job with several columns, including the job Status which will typically be Queued if the job is queuing or Active if the job is Running. The Elapsed Time is also shown. If no job is found, no output is returned - in particular, using qstat with a jobid that has completed will not return any output. There are many option to qstat; use the command

man qstat

for more details. The -f option gives a full output of the job status; to get the full status of job 35929 use the command

qstat -f 35929

The output from qstat -f returns information about the resources used by a job. To show the cpu time, elapsed time, real and virtual memory used by the job  35929 use the command

qstat -f 35929|grep resources

To just show the cpu and elapsed (wall) time, which can be useful to see how efficiently a parallel job is running:

qstat -f 35929|grep resources|grep -v mem

The checkjob command can be used to view a detailed status of each job; with the command followed by the job id:

checkjob 35929

This command shows all job attributes and state information and also provides an analysis of whether or not the job can run. If the job is unable to run, this command will provide a breakdown of these reasons why. The section on associating a job with a project describes how to check if a job is being queued because of an invalid project code being specified.

The showstart command provides an estimate of job start time.

showstart 35929

The output from showstart is based on the load on the system, including the specified walltime for running and queued jobs, at the time that the command is issued. The estimated start time may be inaccurate due to jobs not requiring their stated wall time, which would lead to the job starting sooner than indicated, or higher-priority jobs being submitted and run, which would lead to the job starting later than indicated. In most cases the start time from showstart is pessimistic.

It can be useful to monitor the progress of the job as it runs by looking at the standard error and output files. By default these are written to local disk on the compute nodes and only copied to the user filestore at the end of the job. To monitor the output of the job myjob.sh as it runs there are 3 options, all of which have limitations:

  • qsub -j oe -k oe myjob.sh

The -k option puts all the output of your job into a fixed place in your home directory, named in the default way only [ name.oNNNNN ] This is a little inflexible in using the home directory only.

  • Re-direct the output of the job script using a command like this early on, at the top of the script:

exec > /home/my/own/output  2>&1

this redirects the output from the job script commands only, not system messages at the start and end of the job, which go to the normal file,

  • Re-direct the output of a particular command in the job by writing it directly to a disk file, e.g. in the job script:

mycalcprog > /home/my/own/output 2>&1

In this case it is only the output of the executable that's directed.

With the second and third method care is needed to name the output uniquely, and ideally the name would include the job number.

top


Job Resources and options

There are 4 classes of nodes in the BlueBEAR cluster that are available for general use:

  • 324 dual-socket dual-core (4 cores in all) nodes with 8 GB RAM
  • 16 dual-socket dual-core (4 cores in all) nodes with 16 GB RAM
  • 4 quad socket dual core (8 cores in all) nodes with 32 GB RAM
  • 16 dual-socket dual-core (4 cores in all) nodes with 8 GB RAM for short jobs

Jobs are scheduled onto these nodes in the following order:

  • if the bbquad queue is requested the job will be scheduled onto one of the 8 core nodes. Such nodes will only run one job at any time on the whole node, since it is expected that these will be used by large shared-memory jobs that will benefit from exclusive use of the node. This means that the queuing time for these nodes may be substantial
  • if the bbtest queue is explicitly asked for the job will be scheduled in that queue. If other resources requested by the -l option conflict with the limits for that queue the job will not run
  • If between 8 and 15 GB of memory per node is requested jobs will be scheduled onto one of the 4 core 16 GB nodes. These can run multiple jobs per user and, if not in use, will run jobs that ask for less memory
  • other jobs will be scheduled onto the 4 core/8 GB RAM nodes

There are limits on the resources available to individual jobs and to the overall resources in use at any time by a single user. See the Resource Limits section of this page for details

The following resources can be specified:

  • the walltime, which is the clock time allowed for the job. Note that this is not the CPU time;for jobs running on multiple cores the total CPU time used should be greater than the wall time. The walltime also takes account of time spent in other operations, especially disk access which, for some jobs, may be significant. Since the scheduling system only allows a node to be in use by a single user, and a core to be in use by a single job at any time (but a single job may use multiple cores, of course) in the default, but not the test, queue the walltime should be broadly repeatable between identical runs in the default queue. Specified by the walltime resource in the form hours:minutes:seconds
  • the number of nodes required. Specified by the nodes resource.
  • the number of cores per node. Specified by the ppn property of the nodes resource
  • the anticipated per-process memory use, specified by the pmem resource. This is used to inform the scheduling system of the expected amount of memory that will be used per process, but is not enforced when the job is running. A job that needs more memory that specified will still run, provided of course that the memory it actually needs is available. It is used to help the scheduling system allocate jobs in a memory-efficient way, and can also be used to request exclusive use of a node even for single-core jobs.
    On the majority of nodes there is 8 GB of memory but not all of this is available to jobs due to small system overheads; asking for 7 GB with ppn=1 will ensure that a single-core job will have exclusive use of an 8 GB node, at the cost of possibly increased queuing time depending on the load on the cluster. Asking for values between 8 and 15 GB with ppn=1 will schedule a single-core job on one of the 16 GB nodes, again at the possible expense of increased queuing time.
    The total amount of memory requested per node, which is (number of cores specified by ppn)*(memory per process specified by pmem), must not exceed 7 GB for one of the standard 4 core, 8 GB memory nodes, or 15 GB for one of the 4 core, 16 GB memory nodes. If this limit is exceed the job an error will be given when the job is attempted to be scheduled.
  • the silo on which the job will run. A silo is defined as a set of nodes that can run user jobs that are connected to a common Infiniband switch. The Infiniband  connection between nodes in a silo is faster, and has lower latency, that the connection between nodes in different silos, and so parallel jobs that use the Infiniband network for data traffic may have a lower elapsed time when run on nodes within a silo than when run on nodes in different silos. The gain will be very dependant on the parallelism of the code and especially how much of the time is spent on communication between nodes - in many cases any gain may be more than outweighed by the probable increased queuing time whilst waiting for the set of nodes on the silo to become available.
    Some nodes are reserved for system and other use, and so not every node on every silo is available to jobs. The silos are defined as:
    • U1 with 124 nodes
    • U2 with 127 nodes
    • U3 with 71 nodes
    The silos are defined using the feature property. Note the upper-case U in the definition of the silos; this is case-sensitive.

There is also an overall limit on the total number of cores that any one user may be occupying at any time, to prevent any one user having a disproportionate impact on the service. Queued jobs do not have their requested resources counted against this limit. The limit will be at least 128 and IT Services will review this periodically to ensure reasonable usage of the cluster whilst maintaining a responsive service for all users. The current value is in the table of Resource Limits in this help page.

Resources are specified using the -l (lower-case el) option to qsub. For example, to submit a script that will require 6 hours walltime on a single core:

qsub -l 'walltime=6:0:0' myscript

To request 6 hours wall time, 4 nodes and 4 cores per node with the job constrained to only use nodes in silo 2:

qsub -l 'walltime=6:0:0,nodes=4:ppn=4,feature=U2' myscript

Note that in the example above a colon is used to separate the nodes and ppn values.

To request 4 hours wall time, 4 nodes and 2 cores per node and an anticipated 3 GBytes of memory per process:

qsub -l 'walltime=4:0:0,nodes=4:ppn=2,pmem=3gb' myscript

This job requests a total of 6 GB of memory (2 cores with 3 GB per process) per node. Since this would leave just under 2 GB of memory that has not been requested on a standard node (8 GB, less the 6 GB requested in this job less a small system overhead) only jobs requesting less than 2 GB of memory submitted by the same user would be scheduled to run on the same node.

To submit a job to use 4 cores and 16 GB of memory on one of the 16 GB 4 core nodes::

qsub -l 'walltime=2:0:0,nodes=1:ppn=4,pmem=3gb' myscript

In the above example, the total requested memory per node is 12 GB (4 cores and 3 GB per process), so the job will not run on one of the 8 GB nodes. Since pmem is advisory, not a hard limit, the job will be able to use all of the available memory - just under 16 GB - on the node.

To submit a job to use all 8 cores and all of the memory on an 8 core node:

qsub -q bbquad myscript

To submit a short test job:

qsub -q bbtest myscript

To run an interactive job that will return graphical output to the desktop with a walltime of 12 hours:

qsub -IX -l 'walltime=12:0:0'

Requesting more resources than are needed will usually increase the queuing time for the job; in particular a job asking for multiple nodes and/or cores will have to queue until all of them are available. Requesting more resources than are available may give an error similar or identical to:

qsub: Job rejected by all possible destinations

This message is given if the walltime or memory request cannot be met, but requesting more cores per node (ppn) than are available does not give an error message; the job will queue but never run. This is unlikely to be resolved in the near future.

There are many other options that may be specified; some of the more commonly-used ones are as follows (in all cases, replace words in italics with a suitable value):

  • -j oe joins the standard output (the output that is sent to the screen in an interactive session) and error messages into a single file
  • -N jobname specified the job name
  • -m bea to send mail if the job becomes blocked, starts, finishes or stops
  • -M mail-address is the destination email address for mail requested by the -m option. This can be the central email address (my-mail-name@bham.ac.uk)

All of these can be specified in a script or on the qsub command. For example, if a script myjob contains

#!/bin/bash
#PBS -l "nodes=1:ppn=4,walltime=2:0:0,pmem=3gb"
#PBS -j oe
#PBS -N MathJob
cd "$PBS_O_WORKDIR"
module load apps/mathematica
math < math.in > math.out

either of the following commands will submit the job:

qsub myjob

or

echo "module load apps/mathematica;math < math.in > math.out" | qsub -l  "nodes=1:ppn=4,walltime=2:0:0,pmem=3gb"  -j oe  -N MathJob -d $PWD

where the second command is all on one line. Note that in this case the job will be scheduled to run on one of the 4 core 16 GB nodes, since the memory requested per node is 12 GB

top


Using local disk space

If a job uses significant I/O (Input/Output) then files should be created using the local disk space and only written back to the final directory when the job is completed. This is particularly important if a job makes heavy use of disk for scratch or work files. Heavy I/O to GPFS filestore such as the personal filestore can cause poor interactive response on the cluster for all users.

When a job is submitted an environment variable TMPDIR is set on each of the nodes that the job is running on to a unique directory for that job on the local filestore. This environment variable is available to any prologue and epilogue scripts as well as all process of the running job - that is, for an MPI job a directory with that name will be available on all of the nodes that the job uses. This directory is local to each node, not shared across all of the nodes associated with the job.

A typical sequence of commands to use this local disk space would be:

cd $TMPDIR
cp $PBS_O_WORKDIR/my_input .
mpiexec my_program
cp my_output $PBS_O_WORKDIR

which changes directory to the local temporary directory, copies the input file my_input from the job's working directory (note the final dot in this command to refer to the current directory), runs the parallel program my_program which creates an output file my_output in the program's working directory and copies the output back to the job's working directory so that it is available on the logon nodes. The temporary directory, and any sub-directories, are deleted when the job completes. If this is a single-core job the mpiexec command is not needed, so the commands are:

cd $TMPDIR
cp $PBS_O_WORKDIR/my_input .
my_program
cp my_output $PBS_O_WORKDIR

Many applications can specify a directory for scratch or temporary files which should also be assigned to $TMPDIR. Where appropriate, this is documented in the help for the application that is linked from the list of applications on this service.

The is a limited - typically 100 GB - of space available locally on the nodes. If this proves inadequate, especially for application workspace, please open a call with the IT Service Desk so that we can advise on the best way of running such jobs whilst minimising the impact on the overall service.

top


Associating a job with a project

From 22 March 2010 every job has to be associated with a project to ensure the equitable distribution of the limited resources on the cluster. Project owners and members will have been issued a project code for each registered project which may have  to be quoted when a job is submitted, and only usernames authorised by the project owner will be able to run jobs using that project code.

In most cases a username on the cluster is only associated with one project, and any jobs submitted by that user will be associated with their sole project and no project code needs to be specified in the job script. If a user is associated with more than one project, their default project is the first one that was registered with that username associated with it.

To see the projects that are associated with the logged-on user issue the command

mdiag -u $USER|grep ADEF

which will give a result similar to

ADEF=defproj01  ALIST=defproj01,otherproj02

where ADEF is the default project and ALIST lists all of the projects that the user is authorised to use.

If a user is registered on more than one project  a specific project code is specified by the -A option followed by the project code. For example the following command would submit a job to run under account project01:

qsub -j oe -A project01 myscript

Similarly, if a job script myjob contains:

#!/bin/bash
#PBS -j oe
#PBS -A project01
cd "$PBS_O_WORKDIR"
myscript

and is then submitting using the command

qsub myjob

it will run under the account project01. If the -A option is specified then a project code should also be specified; using -A without a following project code can have an undefined action, which will depend on any subsequent arguments, if any, on the qsub command or in the job script.

If a job is submitted using an invalid project, either because the project does not exist or the username is not authorised to use that project, no warning will be given when the job is submitted but it will just queue and not run. To see the project associated with a job, and any associated messages, use the command

checkjob -v jobname|grep account

where jobname is the job name as shown by the qstat command, for example 2372790.qmaster. A typical output if an invalid project was specified will be

Message[0] job requested invalid account 'nonsuch01'

and if a project was specified that is not authorised for the user:

Message[0] job not authorized to access requested account 'notmine01'

top


Job Environment Variables

Certain environmental variables, see the table below, are available within the Torque/PBS environment which may be used or tested in a job script. Additional variables can be exported  using the -v option of qsub.

Env Variable Value For example
PBS_O_HOME HOME value at qsub time /bb/ccc/hattonps
PBS_O_HOST Hostname at qsub time bluebear3.cvos.cluster
PBS_O_INITDIR set to the value of the qsub -d option directory (undefined by default)
PBS_O_LOGNAME LOGNAME value at qsub time hattonps
PBS_O_MAIL MAIL value at qsub time /var/spool/mail/hattonps
PBS_O_PATH PATH value at qsub time /usr/lib64/qt-3.3/bin:/usr/.......
PBS_O_SHELL SHELL value at qsub time /bin/bash
PBS_O_QUEUE the name of queue to which the job was submitted default
PBS_O_WORKDIR the absolute current directory at qsub time /bb/ccc/hattonps/
PBS_ENVIRONMENT set to PBS_BATCH or to PBS_INTERACTIVE PBS_BATCH
PBS_JOBCOOKIE a hexadecimal string unique to the job (unpredictable)
PBS_JOBID the job identifier assigned by the batch system 515122.qmaster.cvos.cluster
PBS_JOBNAME the job name defaulted or given by the user (-N)
mytest
PBS_NODEFILE the filename of nodelist file (main node script only) /cvos/local/config/torque/aux//515122.qmaster.cvos.cluster
PBS_NODENUM 0 for main or only node, 1:n-1 for others 0
PBS_VNODENUM Under pbsdsh: 0 to c-1 where c is the number of cores 0
PBS_TASKNUM Under pbsdsh: a unique task-instance counter 2
PBS_QUEUE name of the queue from which the job is executed bball
TMPDIR unique directory local to the node /tmp/515122.qmaster.cvos.cluster

top


Multiple Threaded Applications

This section is only relevant to expert users who are familiar with the difference between threads and processes. In the vast majority of case memory requests should be made as described in the section on Job Resources

The pmem parameter specifies the total memory that any process is expected to use; if the process is multi-threaded this specifies the expected total memory per process, not per thread.

Consider the case of a 4-way threaded process requiring a total of 7 GB of memory. This memory could be requested in several ways, only one of which is valid:

  1. specify nodes=1:ppn=4,pmem=7gb, then the job will never run, because moab will say that it asks for 28 GB of  memory (7 GB on each of 4 cores) on one node
  2. specify nodes=1:ppn=4,pmem=2gb, then the job will fail at run-time because pmem applies to the threads in total
  3. specify nodes=1:ppn=1,pmem=7gb, then the job will get full control of the node and have the "right" to use all 4 cores, and the memory requirement will be satisfied.

We recommend that programs which use posix-threads should always ask for the full node in this way, because otherwise their threads will compete with other users' ordinary processes for the 4 cores, which will increase all the programs' wall-time unpredictably.

Note that this is all entirely different to an MPI program which might create a number of different processes, not threads.


Parallel Jobs

Parallel jobs can either consist of a single program distributed across several cores with communication between the distributed components or the same program running in isolation on several cores. The former will usually consist of a program using a library such as MPI to provide the distributed computing facilities, whilst the latter will make use of a facility such as pbsdsh to spawn multiple copies of the same program on different nodes. Both methods are supported on this cluster. Examples of MPI programs are found on the help pages for the mpi compilers mpicc and mpif90 which are linked from the applications help page

top


Resource Limits

There are limits on the resources available to individual jobs and to the overall resources in use at any time by a single user. The limits at the time of writing this page are:

Resource default queue bbquad queue(1) bbtest queue(2)
default maximum default maximum default maximum
 walltime per job 2 hours 240 hours 2 hours 240 hours 15 minutes 15 minutes
number of nodes per job 1 limited by the maximum simultaneous cores(5) 1 2 1 2
number of cores per node 1 4 8 8 1 4
anticipated per-process memory per core 1996mb 16gb(3)(4) 4032mb 32gb(4) 1996mb 1996mb(4)

maximum number of queued and running jobs per user(5)

5000

10

1000

simultaneous cores in use per user (may be across more than 1 job and 1 queue)(6) soft limit 256
hard limit 256

Notes:

  1. To  run on the 4-processor (8 core) 32 GB memory nodes specify -q bbquad on the qsub line, as well as any other resource requests

  2. To run in the test queue specify -q bbtest on the qsub line, as well as any other resource requests

  3. The majority of nodes have 8 GB of memory with 18 having 16 GB. Specifying pmem greater than 8gb will schedule jobs to run on these large-memory nodes, but since these are a scarce resource they may queue for longer than jobs scheduled on the 8 GB nodes.

  4. In all cases, the product  (number of cores specified by ppn)*(memory pre process specified by pmem), must not exceed 7 GB for one of the standard 4 core, 8 GB memory nodes,  15 GB for one of the 4 core, 16 GB memory nodes or 30 GB for one of the 8 core, 32 GB theory nodes. If this limit is exceed the job will queue without being scheduled. Note that, at present, no warning is given if a job will not run due to the memory requested.

  5. This limit is necessary to preserve the responsiveness of MOAB to commands such as qsub and qstat. An error is given if a job is attempted to be submitted if a user has more than the maximum number of queued and running jobs.

  6. The hard limit is the maximum number of cores that can be in use by a single user at any time across all jobs, which may be running in different queues and includes any interactive jobs. If the system is busy then requests for more than the soft limit on the number of cores will be given a lower priority than requests for the soft limit or fewer cores; if the system is quiet then all requests are treated equally. This helps to prevent large jobs from blocking the cluster when it is busy whilst allowing such jobs to run when the cluster is quiet.

top


Last updated: 5 September 2011