
The Kruskal cluster consists of 72 Sun Fire X2200 M2 nodes, each with dual quad-core AMD Opteron processors (576 cores in total). The nodes are interconnected by three InfiniBand DDR switches. These systems share a single Red Hat Enterprise Linux 4 operating system image and follow all portal setup conventions.
As part of the PPPL cluster, a dedicated kruskal queue was created for running batch jobs on the kruskal cluster. Conceptually, the cluster is divided into two partitions: a 64-node big partition (512 cores) and an 8-node small partition (64 cores). The division between the two partitions is soft. Jobs requesting 64 cores or fewer run on the small partition. Jobs requesting more than 64 cores run on the big partition. Jobs requesting more than 512 cores run across both partitions.
The batch system allocates computing resources based on the user's request and on batch policy. Currently, the cluster is not intended to run small jobs with fewer than 16 cores; those can be accommodated by the Kestrel or Kite clusters. The small partition is currently open to all Theory users. Please contact UnixAdmin@pppl.gov to set up access permission for running jobs that require the big partition.
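The partition rule above can be sketched as a small shell function. This is only an illustration of the stated thresholds (64 and 512 cores); the function name kruskal_partition is hypothetical, and the actual placement is decided by the batch system and its policy, not by user code.

```shell
#!/bin/sh
# Hypothetical helper: report which kruskal partition(s) a job requesting
# N cores would be expected to use, per the thresholds described above.
kruskal_partition() {
    cores=$1
    if [ "$cores" -le 64 ]; then
        echo "small"    # 64 cores or fewer: small partition
    elif [ "$cores" -le 512 ]; then
        echo "big"      # more than 64, up to 512: big partition
    else
        echo "both"     # more than 512: spans both partitions
    fi
}
```

For example, calling kruskal_partition 128 prints "big".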
1. Users are strongly encouraged to build their executables with "pathscale/3.2" and "openmpi". Applications built with MPICH will run on kruskal over IPoIB, but they cannot fully use the underlying InfiniBand capacity through RDMA.
2. OpenMPI is integrated with the batch system. It uses a considerable amount of shared memory with RDMA technology. Users running openmpi need to increase the locked-memory limit:
for a C shell user, put the following line in the ~/.cshrc file: limit memorylocked 1048576
for a Bash shell user, put the following line in the ~/.bashrc file: ulimit -l 1048576
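To confirm that the new limit is in effect, the current value can be compared against the figure given above. The following is a minimal sketch for Bash users; the function name memlock_ok is hypothetical, and it assumes the limit is passed in as reported by ulimit -l (a number, or the word "unlimited").

```shell
# Hypothetical helper: succeed if a locked-memory limit is at least the
# required value (default 1048576, the figure used on this page).
memlock_ok() {
    limit=$1
    required=${2:-1048576}
    # "unlimited" always satisfies the requirement.
    [ "$limit" = "unlimited" ] && return 0
    [ "$limit" -ge "$required" ]
}
```

In a new login shell, memlock_ok "$(ulimit -l)" && echo "limit OK" should print "limit OK" once the line in ~/.bashrc has taken effect.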
3. OpenMPI attempts to determine which BTL driver to use at run time. The following example shows how to use the mca option (--mca) to instruct it to use the InfiniBand driver:
mpirun --mca btl openib -np 512 myprog
Note: This is only recommended for older versions of openmpi, such as version 1.2.7. "--mca btl openib" became obsolete in version 1.3 and up, so the mca option is no longer needed. Adding it improperly could stop your program from running.
4. Here are two sample job scripts:
For jobs running on the kruskal-big partition:
#PBS -q kruskal
#PBS -N my_job
#PBS -l nodes=64:ppn=8,mem=1025795mb
#PBS -r n
#PBS -l walltime=24:00:00
cd $PBS_O_WORKDIR
date
mpirun --mca btl openib -np 512 ./prog
date
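The mpirun line in the script above hard-codes "--mca btl openib", which this page recommends only for openmpi versions older than 1.3. A script meant to work with several versions could choose the flag conditionally. The sketch below assumes the version string (for example 1.2.7) is supplied by the user; the function name btl_flags is hypothetical, and the 1.3 cutoff is taken from the note on this page.

```shell
# Hypothetical helper: print the extra mpirun flags for a given openmpi
# version string of the form MAJOR.MINOR[.PATCH].
btl_flags() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # text up to the next dot, if any
    # Only 1.x releases older than 1.3 get the explicit BTL selection.
    if [ "$major" -eq 1 ] && [ "$minor" -lt 3 ]; then
        echo "--mca btl openib"
    fi
}
```

The mpirun line could then be written as mpirun $(btl_flags 1.2.7) -np 512 ./prog, which expands to the original command for 1.2.7 and drops the flag for 1.3 and later.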
For jobs running on the kruskal-small partition:
#PBS -q kruskal
#PBS -l walltime=2:00:00,mem=64000mb
#PBS -l nodes=32
#PBS -N my_job
time mpirun --mca btl openib -np 32 ./prog
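Both sample scripts repeat the core count in the -np argument. As an optional variation, the count can be derived from the node file that PBS provides to the job through $PBS_NODEFILE, which lists one line per allocated processor slot. This is a sketch rather than a site recommendation; the function name count_slots is hypothetical.

```shell
# Hypothetical helper: count the non-empty lines of a PBS node file,
# i.e. the number of processor slots allocated to the job.
count_slots() {
    awk 'NF { n++ } END { print n + 0 }' "$1"
}
```

Inside a job script this could be used as mpirun -np "$(count_slots "$PBS_NODEFILE")" ./prog, so that the process count follows the "#PBS -l nodes=..." request automatically.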