| System | Leo3, Leo3e | Mach | VSC3 |
|---|---|---|---|
| Operating System | CentOS 6.3, CentOS 7.1 | SuSE SLES 11.1 | Scientific Linux 6.6 |
| Architecture | Infiniband Cluster.<br>Leo3: 162 nodes (physical machines), 12 CPUs/node (2 sockets @ 6 cores), 1944 CPUs total, 24 GB/node, 2 GB/CPU.<br>Leo3e: 45 nodes (physical machines), 20 CPUs/node (2 sockets @ 10 cores), 900 CPUs total, 64 GB/node (nodes 44+45: 512 GB), 3.2 GB/CPU (nodes 44+45: 25.6 GB) | ccNUMA SMP.<br>1 node = 256 vnodes (virtual nodes = processor sockets), 8 CPUs/vnode (8 cores per socket), 2048 CPUs total, 8 GB/CPU, 64 GB/vnode, 16 TB total | Infiniband Cluster.<br>Approx. 2000 nodes, 16 CPUs/node (2 sockets @ 8 cores), 32000 CPUs total.<br>Standard nodes: 64 GB → 4 GB/CPU.<br>Big nodes: 128 or 256 GB → 8/16 GB/CPU |
| Topology | Leo3 (Leo3e): 7 (2) units with up to 24 nodes each. Blocking factor 1:2 between units. | Fat Tree. Batch scheduler allocates vnodes on a best-fit basis. | 8 islands, each consisting of up to 24 units with 12 nodes each (i.e. up to 2304 nodes total). Blocking factor 1:2 within islands, 1:4 between islands. |
| File System Architecture | | | |
| Job Scheduler | SGE | PBS Pro | SLURM |
| Allocation Granularity | 1 CPU.<br>Leo3: 1 node = 12 CPUs + 24 GB memory.<br>Leo3e: 1 node = 20 CPUs + 64 GB memory (nodes 44+45: 512 GB).<br>Note that several jobs from different users may run on the same node, so this system is suitable for the entire range from small sequential programs, through multithreaded (e.g. OpenMP) programs with up to 12 (Leo3) resp. 20 (Leo3e) parallel threads, to relatively large parallel MPI jobs using hundreds of CPUs.<br>Please specify realistic memory requirements to assist job placement. | Multiples of 1 vnode (= 8 CPUs, 64 GB memory each).<br>Mach is a special machine for large parallel jobs with high memory demands. If you need little memory or fewer than 8 CPUs per program run, please consider using another system. | Multiples of 1 node (= 16 CPUs + 64 GB, 128 GB, or 256 GB memory each). Each node is assigned to a job exclusively.<br>Jobs can span many nodes, up to very large parallel MPI jobs using many hundreds of CPUs.<br>If your individual programs cannot profit from the minimum degree of parallelism (16 threads or tasks), please consider employing a job farming scheme (description by LRZ; please note that their conventions are different) or using a different system. |
| Job Submission | `qsub scriptfile` | `qsub scriptfile` | `sbatch scriptfile` |
| Query Jobs | `qstat` | `qstat` | `squeue` |
| Cancel Jobs | `qdel jobid` | `qdel jobid` | `scancel jobid` |
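To illustrate the commands above, here is a hypothetical submit–query–cancel cycle; the script name `job.sh` and job ID `4711` are placeholders, not output from these systems.

```sh
# Leo3/Leo3e (SGE) and Mach (PBS Pro)
qsub job.sh        # submit the batch script; the job ID is printed
qstat              # list your pending and running jobs
qdel 4711          # cancel job 4711

# VSC3 (SLURM)
sbatch job.sh      # submit; prints "Submitted batch job 4711"
squeue -u $USER    # list your own jobs
scancel 4711       # cancel job 4711
```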
**Format of batch script file:** All systems permit supplying processing options as command line parameters of the submit command (`qsub`/`sbatch`) or as directives in the batch script file (recommended and described below). Complete example scripts for all three schedulers follow the second table.
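As a sketch of the two equivalent styles (shown here with SLURM on VSC3; the script name and values are placeholders):

```sh
# Style 1: options as command line parameters of the submit command
sbatch -J myjob -t 00:30:00 job.sh

# Style 2 (recommended): the same options as directives inside job.sh:
#   #!/bin/bash
#   #SBATCH -J myjob
#   #SBATCH -t 00:30:00
#   ./my_prog
sbatch job.sh
```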
| | Leo3, Leo3e (SGE) | Mach (PBS Pro) | VSC3 (SLURM) |
|---|---|---|---|
| General Scheduler Directives | `#!/bin/bash`<br>`#$ -N jobname` (optional)<br>`#$ -o outfile` (default: jobname.ojobid)<br>`#$ -e errfile` (default: jobname.ejobid)<br>`#$ -j yes\|no` (join stderr to stdout)<br>`#$ -cwd` (run job in current directory; default: $HOME) | `#!/bin/bash`<br>`#PBS -N jobname` (optional)<br>`#PBS -o outfile` (default: jobid.o)<br>`#PBS -e errfile` (default: jobid.e)<br>`#PBS -j oe` (join stderr to stdout)<br>After the last directive: `cd $PBS_O_WORKDIR` (run job in current directory; default: $HOME) | `#!/bin/bash`<br>`#SBATCH -J jobname` (optional)<br>`#SBATCH -o outfile` (default: slurm-%j.out; stderr goes to stdout)<br>`#SBATCH -D directory` (run job in specified directory; default: current directory)<br>`#SBATCH -p partition` (optional) selects the type of node by required memory (GB): partition ::= { mem_0064 (default) \| mem_0128 \| mem_0256 } |
| Notification Directives | `#$ -M mail-address`<br>`#$ -m b\|e\|a\|s\|n` (begin\|end\|abort\|suspend\|none) | `#PBS -M mail-address`<br>`#PBS -m b\|e\|a\|n` (begin\|end\|abort\|none) | `#SBATCH --mail-type=BEGIN\|END\|FAIL\|REQUEUE\|ALL`<br>`#SBATCH --mail-user=user` |
| Resource Directives | Run time:<br>`#$ -l h_rt=[hh:mm:]ss`<br>Tasks, Threads: see Task Distribution below.<br>Per-slot virtual memory size (bytes):<br>`#$ -l h_vmem=size` (default: 2 GB Leo3, 1 GB Leo3e)<br>Per-slot stack size limit (bytes):<br>`#$ -l h_stack=size` | Run time:<br>`#PBS -l walltime=[hh:mm:]ss`<br>Tasks, Threads<br>Memory | Run time:<br>`#SBATCH -t mm\|mm:ss\|hh:mm:ss\|days-hh[:mm[:ss]]`<br>Nodes, Tasks, Threads |
| Task Distribution | | | For more options and details, see `man sbatch` |
| Interactive jobs | | | TBD |
| Remarks | Any non-directive line (e.g. a command) terminates processing of directives in the script. | | Options may be parameterized using macros, e.g. %j (job id), %u (user name). |
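Putting the directives together, a minimal SGE batch script for Leo3/Leo3e might look as follows; the job name, mail address, program, and limits are placeholders, not recommendations.

```sh
#!/bin/bash
#$ -N myjob                # job name (optional)
#$ -j yes                  # join stderr to stdout
#$ -cwd                    # run the job in the current (submission) directory
#$ -M user@example.org     # notification mail address
#$ -m e                    # send mail at job end
#$ -l h_rt=01:00:00        # run time limit: 1 hour
#$ -l h_vmem=2G            # per-slot virtual memory limit

./my_prog                  # first non-directive line ends directive processing
```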
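A corresponding sketch for Mach (PBS Pro), under the same assumptions:

```sh
#!/bin/bash
#PBS -N myjob                  # job name (optional)
#PBS -j oe                     # join stderr to stdout
#PBS -M user@example.org       # notification mail address
#PBS -m e                      # send mail at job end
#PBS -l walltime=01:00:00      # run time limit: 1 hour

cd $PBS_O_WORKDIR              # run the job in the submission directory
./my_prog
```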
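And for VSC3 (SLURM), again with placeholder values:

```sh
#!/bin/bash
#SBATCH -J myjob                       # job name (optional)
#SBATCH -o myjob-%j.out                # stdout+stderr; %j expands to the job id
#SBATCH -p mem_0064                    # standard 64 GB nodes (default partition)
#SBATCH -t 01:00:00                    # run time limit: 1 hour
#SBATCH --mail-type=END                # send mail at job end
#SBATCH --mail-user=user@example.org   # notification mail address

./my_prog
```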