Modules Environment, Slurm and Spack
Purpose of Software Environment Modules
The Environment Modules package allows us to simultaneously offer many software products in multiple versions. When we upgrade software, the previous versions will still be available to users, thus avoiding unplanned disruptions of ongoing projects.
Use the Modules environment commands to coherently set and unset all paths and environment variables (e.g. PATH, MANPATH, CPATH, LD_RUN_PATH, etc.) that are necessary to use available (and sometimes conflicting) software packages on our HPC systems, by simply loading or unloading the corresponding module files.
We use various methods to install software, in particular manual installation, Anaconda, and Spack. The organization of the modules on our systems reflects these differences.
On a typical UIBK system, the available sections may look like this:
- Application-Software
- Development
- Python+R-Anaconda3
- Spack-Instances
- Spack-leo5-20230116
Most installed third-party software packages are available in a two-level structure of module files:
application/version
Example:
matlab/R2022a
For software installed with Spack, the structure is
software/version-toolchain[...]-hash.
The toolchain is the compiler used to build the given
package.
Example:
zlib/1.2.13-gcc-8.5.0-xlt7jpk
is the module for the zlib library built with GCC 8.5.0. The
7-character hash makes it possible to distinguish between multiple
variants of the same software.
Some module names contain additional components, such as the MPI version used with a parallel software package.
Example:
fftw/3.3.10-openmpi-4.1.4-intel-2021.7.1-ndq6d76
is the module for the FFTW library, linked with OpenMPI 4.1.4 and built
with the Intel Classic toolchain.
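The naming scheme for Spack modules can be taken apart with plain shell parameter expansion. The following is a minimal sketch, not part of the UIBK setup: the function name parse_spack_module is hypothetical, and it assumes that the version itself contains no hyphen and that the last hyphen-separated field is the hash.

```shell
# Split a Spack-style module name (software/version-toolchain[...]-hash)
# into its parts. Assumption: no hyphen inside the version, hash comes last.
parse_spack_module() {
    local name="$1"
    local software="${name%%/*}"      # text before the first "/"
    local rest="${name#*/}"           # text after the first "/"
    local hash="${rest##*-}"          # last hyphen-separated field
    local middle="${rest%-*}"         # everything before the hash
    local version="${middle%%-*}"     # first hyphen-separated field
    local toolchain="${middle#*-}"    # remaining build components
    printf 'software=%s version=%s toolchain=%s hash=%s\n' \
        "$software" "$version" "$toolchain" "$hash"
}

parse_spack_module zlib/1.2.13-gcc-8.5.0-xlt7jpk
parse_spack_module fftw/3.3.10-openmpi-4.1.4-intel-2021.7.1-ndq6d76
```

For names with additional components, such as the fftw example, everything between the version and the hash (here the MPI version and the compiler) ends up in the toolchain field.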
When you issue a module load command, e.g.
module load application/version
the module's environment variables will be set in your current shell.
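To make concrete what "setting environment variables in your current shell" means, here is an illustrative sketch of the effect of a single path change made by a module. It is not how the module command is implemented; the function prepend_path, the variable DEMO_PATH, and the directory /opt/demo/bin are all hypothetical.

```shell
# Illustration only: mimic the effect of one "prepend-path" style change.
# prepend_path, DEMO_PATH and /opt/demo/bin are hypothetical names.
prepend_path() {
    local var="$1" dir="$2" current
    # Look up the value of the variable whose name is stored in $var.
    eval "current=\${$var:-}"
    if [ -z "$current" ]; then
        export "$var=$dir"
    else
        export "$var=$dir:$current"
    fi
}

DEMO_PATH=/usr/bin
prepend_path DEMO_PATH /opt/demo/bin
echo "$DEMO_PATH"
```

Because the new directory is placed in front, commands found there take precedence over those in directories listed later.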
Please note: All software that comes pre-installed with the operating system, such as the default GNU Compiler Collection (GCC) and many basic Linux utilities, can be accessed without loading modules. Different versions of the same software may be available via the Modules environment.
Setting up the Modules environment
The Modules environment will automatically be initialized when you log in.
Depending on your needs and habits, you may...
- ... manually load modules as needed (recommended if you use different packages or versions at different times), or
- ... add module load commands to your $HOME/.bashrc (recommended if you always use the same set of software packages).
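If you choose the second option, a defensive snippet along the following lines may be useful. This is a sketch under assumptions: the function name load_default_modules is hypothetical, the module name is only an example, restricting the call to interactive shells is a design choice rather than a site requirement, and whether module load returns a non-zero status on failure depends on the Modules version.

```shell
# Sketch for $HOME/.bashrc. load_default_modules is a hypothetical name;
# replace the example module name with your own.
load_default_modules() {
    # Do nothing if the module command is not defined in this shell.
    if ! command -v module >/dev/null 2>&1; then
        return 0
    fi
    local m
    for m in "$@"; do
        module load "$m" || echo "warning: could not load $m" >&2
    done
}

# Load the defaults only in interactive shells.
case "$-" in
    *i*) load_default_modules matlab/R2022a ;;
esac
```

Keeping the list of modules in one place makes it easy to see which versions your login environment depends on.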
Modules Environment and Batch Jobs
By default, jobs submitted with sbatch will inherit all environment variables of your current shell, including currently loaded modules.
If you want to set a job's environment variables and modules independently, do the following:
- Use the --export=NONE option on the sbatch command line or in an #SBATCH directive,
- Include all desired module load commands in your job script.
Example:
#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --export=NONE
module load software-a/version-a
module load software-b/version-b
my-commands
Please note: in contrast to SGE, Slurm jobs will not run your $HOME/.bashrc file. So module load commands contained in .bashrc will be run only when you log in.
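Since the job script has to name every module explicitly, it can be convenient to generate it from a list of modules. The following is a minimal sketch; make_job_script is a hypothetical helper, and my-commands remains a placeholder for your actual workload.

```shell
# Print a job script that sets up its own environment.
# make_job_script is a hypothetical helper; "my-commands" is a placeholder.
make_job_script() {
    local jobname="$1"
    shift
    printf '%s\n' '#!/bin/bash' \
        "#SBATCH --job-name=$jobname" \
        '#SBATCH --export=NONE'
    local m
    for m in "$@"; do
        printf 'module load %s\n' "$m"
    done
    printf '%s\n' 'my-commands'
}

make_job_script myjob software-a/version-a software-b/version-b
```

Redirect the output to a file and submit that file with sbatch. The generated script does not depend on the modules loaded in the submitting shell.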
Working with the Module Environment
The module command has a number of sub-commands. In the following sections, we briefly discuss the most important ones.
module avail
Display a list of available modules, grouped by categories as discussed above.
Example:
$ module avail
----------- /path_to_module_categories/{Application-Software|Compiler|...} -----------
application1/version1
application2/version2
application3/version3
...
module show
The show subcommand displays all changes that a given module makes to your environment. This way, you can also find out where application binaries, or libraries for linking your own programs, are located.
Example:
c102mf@login.leo5[0]:~$ module show matlab/R2022a
-------------------------------------------------------------------
/usr/site/hpc/modules/leo5/Application-Software/matlab/R2022a:
setenv          MATLAB /usr/site/hpc/x86_64/generic/matlab/R2022a
prepend-path    PATH /usr/site/hpc/x86_64/generic/matlab/R2022a/bin
prepend-path    LD_LIBRARY_PATH /usr/site/hpc/x86_64/generic/matlab/R2022a/runtime/glnxa64
prepend-path    LD_LIBRARY_PATH /usr/site/hpc/x86_64/generic/matlab/R2022a/bin/glnxa64
prepend-path    LD_LIBRARY_PATH /usr/site/hpc/x86_64/generic/matlab/R2022a/sys/os/glnxa64
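To pull the directories for one variable out of such output, a small filter can help. This is a sketch that assumes the three-column "prepend-path VARIABLE DIRECTORY" layout of the matlab/R2022a example; module_paths is a hypothetical name, and other Modules versions may print extra options on these lines.

```shell
# Read "module show" output on stdin and print the directories that the
# module prepends to the given variable. module_paths is a hypothetical name.
module_paths() {
    awk -v var="$1" '$1 == "prepend-path" && $2 == var { print $3 }'
}

# Sample input taken from the matlab/R2022a example.
module_paths LD_LIBRARY_PATH <<'EOF'
setenv MATLAB /usr/site/hpc/x86_64/generic/matlab/R2022a
prepend-path PATH /usr/site/hpc/x86_64/generic/matlab/R2022a/bin
prepend-path LD_LIBRARY_PATH /usr/site/hpc/x86_64/generic/matlab/R2022a/runtime/glnxa64
EOF
```

On a live system you would typically run something like module show matlab/R2022a 2>&1 | module_paths PATH. The 2>&1 is there because, depending on version and configuration, the module command may write this report to standard error.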
module load
With the load subcommand you can load one or more of the available module files:
module load gcc/11.3.0-gcc-8.5.0-rwipohd openmpi/4.1.4-gcc-11.3.0-gfkyxua netcdf-c/4.9.0-openmpi-4.1.4-gcc-11.3.0-zvsdgrp
Please note: Traditionally, the environment variable
LD_LIBRARY_PATH had to be set at run time so programs linked
using libraries from installed software could locate the correct
runtime objects.
In our UIBK/LEO installation, this is no
longer necessary. When you load a module installed by Spack, the
LD_RUN_PATH variable is set to include the location of the
shared objects, so the correct libraries are added to the
RPATH attribute of your executables. To run your program, you
therefore only need to load the modules that provide the shell-level
commands you use. Libraries are found via the RPATH attribute of
your executables and are loaded automatically as needed.
Example:
To build your program, do
module load gcc/11.3.0-gcc-8.5.0-rwipohd
module load openmpi/4.1.4-gcc-11.3.0-gfkyxua
module load netcdf-c/4.9.0-openmpi-4.1.4-gcc-11.3.0-zvsdgrp
make myprogram
To run your program, simply do
module load openmpi/4.1.4-gcc-11.3.0-gfkyxua
mpirun -np ntasks myprogram
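If you want to check which library directories were actually recorded in an executable, you can inspect its dynamic section. The sketch below assumes that readelf from GNU binutils is available and that its output uses the usual "(RPATH)" or "(RUNPATH)" entries with the path list in square brackets; rpath_entries is a hypothetical helper name.

```shell
# Read "readelf -d" output on stdin and print each RPATH/RUNPATH directory
# on its own line. rpath_entries is a hypothetical helper name.
rpath_entries() {
    sed -nE 's/.*\((RPATH|RUNPATH)\).*\[(.*)\].*/\2/p' | tr ':' '\n'
}

# Typical use (assumes binutils' readelf is installed):
#   readelf -d myprogram | rpath_entries
```

If the list is empty, the executable carries no run path, and its libraries would have to be found by other means at run time.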
module list
With the list subcommand you get the list of all currently loaded module files:
user@login.leo5[0]:~$ module list
Currently Loaded Modulefiles:
 1) gcc/11.3.0-gcc-8.5.0-rwipohd
 2) openmpi/4.1.4-gcc-11.3.0-gfkyxua
 3) netcdf-c/4.9.0-openmpi-4.1.4-gcc-11.3.0-zvsdgrp
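In scripts, it is often easier to test for a loaded module than to parse the listing. Environment Modules records the loaded modules as a colon-separated list in the LOADEDMODULES environment variable. The following sketch relies on that variable; the function name is_loaded is hypothetical.

```shell
# Return success if the exact module name appears in LOADEDMODULES,
# the colon-separated list maintained by the module command.
is_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if is_loaded openmpi/4.1.4-gcc-11.3.0-gfkyxua; then
    echo "OpenMPI module is loaded"
else
    echo "OpenMPI module is not loaded"
fi
```

The surrounding colons ensure that only complete module names match, not prefixes of longer names.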
module unload/purge
With the unload subcommand you can unload one or more of the loaded module files (see the list of loaded modules above), for example:
$ module unload netcdf-c/4.9.0-openmpi-4.1.4-gcc-11.3.0-zvsdgrp
Similarly, with the command
$ module purge
all loaded modules are unloaded at once.
module help
The help subcommand on its own gives general information about module usage. When you pass a specific module name to the help subcommand, additional information about that module is displayed:
user@login.leo5[0]:~$ module help openmpi/4.1.4-gcc-11.3.0-gfkyxua
-------------------------------------------------------------------
Module Specific Help for /usr/site/hpc/modules/leo5/Spack-leo5-20230116/openmpi/4.1.4-gcc-11.3.0-gfkyxua:
openmpi@4.1.4%gcc@11.3.0~atomics~cuda~cxx~cxx_exceptions+gpfs~internal-hwloc~java+legacylaunchers~lustre~memchecker+pmi+romio+rsh~singularity+static+vt+wrapper-rpath build_system=autotools fabrics=auto schedulers=slurm arch=linux-rocky8-icelake/gfkyxua
An open source Message Passing Interface implementation. The Open MPI
Project is an open source Message Passing Interface [...]
Please note: (UIBK extension) For software installed with Spack, the module help command will also display the particular set of options used to install the package. This allows you to distinguish between variants of packages whose names only differ by their hash. The syntax used for these specifications is described below.
Modules and Spack
New Spack versions are released approximately once per year, supporting new versions of existing software and new functionality. We will make these versions available to users by installing new Spack release-instances. We try to keep these reasonably complete by adding software as needed.
At login, your MODULEPATH will always refer to the newest Spack release-instance.
When users request software versions more recent than those provided by the stable Spack releases, we install additional instances based on the develop version of Spack. These will typically contain only the few software packages requested by users.
All available Spack instances may be listed by issuing
$ module avail spack
spack/v0.19-leo5-20230116-release
spack/v0.20-leo5-20230124-develop
The output will indicate the Spack version, install date and release status.
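The fields of such an instance name can be separated in the shell. This sketch assumes the pattern spack/VERSION-SYSTEM-DATE-STATUS inferred from the two names shown here; the function name parse_spack_instance and the label "system" for the second field are hypothetical.

```shell
# Split a Spack instance module name into its fields.
# Assumed pattern (inferred from the listed names):
#   spack/VERSION-SYSTEM-DATE-STATUS
parse_spack_instance() {
    local name="${1#spack/}"          # drop the leading "spack/"
    local version system date status
    IFS=- read -r version system date status <<EOF
$name
EOF
    printf 'version=%s system=%s date=%s status=%s\n' \
        "$version" "$system" "$date" "$status"
}

parse_spack_instance spack/v0.19-leo5-20230116-release
parse_spack_instance spack/v0.20-leo5-20230124-develop
```

For the first name this prints version=v0.19 system=leo5 date=20230116 status=release.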
To access any given Spack instance or directly use Spack commands, issue
module load spack/version
This will remove the default set of Spack modules from your MODULEPATH and add the selected instance instead. Then issue module avail to obtain an overview of the software installed in that Spack instance.
The above command will also give you access to the spack command and provide an option to activate Spack's shell integration.
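To see how MODULEPATH changes when you switch instances, you can print its entries one per line before and after loading a spack module. A minimal sketch follows; modulepath_entries is a hypothetical helper name.

```shell
# Print the directories in MODULEPATH, one per line.
# modulepath_entries is a hypothetical helper name.
modulepath_entries() {
    printf '%s\n' "${MODULEPATH:-}" | tr ':' '\n'
}

modulepath_entries
```

Comparing the two listings shows which module directory was removed and which one replaced it.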
Additional information about the Modules environment
For further information about the Modules environment, please see the
man page (man module). The official documentation can be found at the
Environment Modules website.
Spack Specification Expression Syntax
The following output of the spack help --spec command should help you understand the output of the module help name command described above.
spec expression syntax:

  package [constraints] [^dependency [constraints] ...]

  package                           any package from 'spack list', or
  /hash                             unique prefix or full hash of
                                    installed package

  constraints:
    versions:
      @version                      single version
      @min:max                      version range (inclusive)
      @min:                         version <min> or higher
      @:max                         up to version <max> (inclusive)

    compilers:
      %compiler                     build with <compiler>
      %compiler@version             build with specific compiler version
      %compiler@min:max             specific version range (see above)

    compiler flags:
      cflags="flags"                cppflags, cflags, cxxflags,
                                    fflags, ldflags, ldlibs
      cflags=="flags"               propagate flags to package dependencies
                                    cppflags, cflags, cxxflags, fflags,
                                    ldflags, ldlibs

    variants:
      +variant                      enable <variant>
      ++variant                     propagate enable <variant>
      -variant or ~variant          disable <variant>
      --variant or ~~variant        propagate disable <variant>
      variant=value                 set non-boolean <variant> to <value>
      variant==value                propagate non-boolean <variant> to <value>
      variant=value1,value2,value3  set multi-value <variant> values
      variant==value1,value2,value3 propagate multi-value <variant> values

    architecture variants:
      platform=platform             linux, darwin, cray, etc.
      os=operating_system           specific <operating_system>
      target=target                 specific <target> processor
      arch=platform-os-target       shortcut for all three above

    cross-compiling:
      os=backend or os=be           build for compute node (backend)
      os=frontend or os=fe          build for login node (frontend)

    dependencies:
      ^dependency [constraints]     specify constraints on dependencies
      ^/hash                        build with a specific installed
                                    dependency

  examples:
      hdf5                          any hdf5 configuration
      hdf5 @1.10.1                  hdf5 version 1.10.1
      hdf5 @1.8:                    hdf5 1.8 or higher
      hdf5 @1.8: %gcc               hdf5 1.8 or higher built with gcc
      hdf5 +mpi                     hdf5 with mpi enabled
      hdf5 ~mpi                     hdf5 with mpi disabled
      hdf5 ++mpi                    hdf5 with mpi enabled and propagates
      hdf5 ~~mpi                    hdf5 with mpi disabled and propagates
      hdf5 +mpi ^mpich              hdf5 with mpi, using mpich
      hdf5 +mpi ^openmpi@1.7        hdf5 with mpi, using openmpi 1.7

      boxlib dim=2                  boxlib built for 2 dimensions
      libdwarf %intel ^libelf%gcc
          libdwarf, built with intel compiler, linked to libelf built with gcc
      mvapich2 %pgi fabrics=psm,mrail,sock
          mvapich2, built with pgi compiler, with support for multiple fabrics
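With this syntax in mind, the long spec line printed by module help can be broken down mechanically. The sketch below lists the boolean variants of a spec, assuming they appear as +name or ~name, as in the openmpi example; spec_variants is a hypothetical helper and not a Spack command. A "+" inside a version string would be misread, so treat the result as a reading aid only.

```shell
# List the boolean variants of a Spack spec string.
# spec_variants is a hypothetical helper; it assumes variants are written
# as +name (enabled) or ~name (disabled).
spec_variants() {
    local sign
    case "$1" in
        enabled)  sign='+' ;;
        disabled) sign='~' ;;
        *) echo "usage: spec_variants enabled|disabled SPEC" >&2; return 2 ;;
    esac
    printf '%s\n' "$2" | grep -oE "[$sign][A-Za-z0-9_-]+" | cut -c2-
}

# Shortened version of the openmpi spec from the module help example.
spec='openmpi@4.1.4%gcc@11.3.0~atomics~cuda+gpfs~java+pmi+romio build_system=autotools'
spec_variants enabled "$spec"
```

Calling spec_variants disabled "$spec" lists the variants that were switched off, which helps when comparing two builds whose module names differ only by their hash.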