Using pre-installed software

The software most commonly used on clusters is already pre-installed on the CÉCI clusters: Fortran and C/C++ compilers, popular interpreters like Python or R, numerical libraries such as BLAS and LAPACK, and message passing libraries (e.g. OpenMPI). In addition, several versions of each package are often installed.

If you are already familiar with an environment modules tool and are only interested in the list of software available on the CÉCI clusters, go to the Software installed in the clusters section. Otherwise, please keep reading to learn how to enable the software you want.

Loading software with the module command

The installed software is enabled with the module command. Its main options are:

module  avail | av            list available software (modules)
module  load | add [module]   set up the environment to use the software
module  list                  list currently loaded software
module  purge                 clears the environment
module  spider                list all possible modules
module  show                  show the commands in the module file
module  help                  get help

For instance, module avail openmpi will list all available OpenMPI modules. Assuming it returns something like

OpenMPI/2.1.1-GCC-6.4.0-2.28
OpenMPI/2.1.1-iccifort-2017.4.196-GCC-6.4.0-2.28
OpenMPI/3.1.1-GCC-7.3.0-2.30

then with the command

module load OpenMPI/2.1.1-GCC-6.4.0-2.28

you will enable OpenMPI version 2.1.1 compiled with GCC version 6.4.0. The naming convention for the available modules is always of the form software/version-toolchain (more on the toolchain part below).

After doing this, when you run e.g. mpicc or mpirun without specifying the full path, you will be running the compiler wrapper or launcher that belongs to that specific version of OpenMPI.
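The software/version-toolchain convention described above can be taken apart with plain shell parameter expansion. The following is only a sketch: split_module_name is a hypothetical helper, not part of the module tool, and it assumes that the version itself contains no hyphen, which holds for the module names listed above but may not hold for every module.

```shell
# Hypothetical helper: split a module name of the form
# software/version-toolchain into its three parts.
# Assumption: the version contains no hyphen.
split_module_name() {
    full=$1
    software=${full%%/*}      # text before the first "/"
    rest=${full#*/}           # text after the first "/"
    version=${rest%%-*}       # text before the first "-"
    if [ "$version" = "$rest" ]; then
        toolchain=""          # no toolchain suffix in this name
    else
        toolchain=${rest#*-}  # text after the first "-"
    fi
    printf '%s\n' "software=$software" "version=$version" "toolchain=$toolchain"
}
```

For example, split_module_name OpenMPI/2.1.1-GCC-6.4.0-2.28 reports OpenMPI as the software, 2.1.1 as the version and GCC-6.4.0-2.28 as the toolchain.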

Note

The module load command can be abbreviated as ml, e.g. ml HDF5. Run without arguments, ml is equivalent to module list.

Loading a module sets a group of environment variables such as $LD_LIBRARY_PATH, $CPATH, $PATH, ... so that they point to the custom installation paths of the software. You can verify how those variables change when loading or unloading a given module. Because we provide several versions of the same software and libraries, each version must be installed in its own custom location. It is not possible to rely only on the standard Unix locations /usr/lib, /usr/include, /usr/bin, ..., which typical installations use when only a single version of each package is present.
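One way to observe these changes is to compare the environment before and after running a command. The sketch below defines env_diff, a hypothetical helper that is not part of the module tool. It runs its arguments in the current shell and prints the environment entries that are new or changed afterwards. Variables whose values span several lines may produce some noise.

```shell
# Hypothetical helper: run a command in the current shell and print the
# environment entries that are new or changed once it has finished.
env_diff() {
    before=$(mktemp) || return 1
    after=$(mktemp) || return 1
    env | sort > "$before"
    "$@"                          # the command runs in this shell
    env | sort > "$after"
    comm -13 "$before" "$after"   # lines present only after the command
    rm -f "$before" "$after"
}
```

With the example module used earlier, env_diff module load OpenMPI/2.1.1-GCC-6.4.0-2.28 would show the new values of variables such as PATH and LD_LIBRARY_PATH.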

Most software needs several dependencies in order to be built; in the example above, at least GCC 6.4.0 was needed. Loading a module also loads all the dependencies that were required at build time. You can verify which other dependencies were pulled in with

module list

If you want to reset the environment (unload all the modules) use the command

module purge

You will have to do so if you want to load another version of the software, or to try compiling your own software with different versions of the compiler and libraries.
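The purge, load and list steps can be combined so that a session or script always starts from a known state. The sketch below defines load_clean, a hypothetical helper that is not part of the module tool. It only calls the module sub-commands already described above.

```shell
# Hypothetical helper: reset the environment, load exactly the modules
# given as arguments, then show what ended up loaded (dependencies included).
load_clean() {
    module purge || return 1
    for m in "$@"; do
        module load "$m" || {
            echo "could not load $m" >&2
            return 1
        }
    done
    module list
}
```

For instance, load_clean OpenMPI/3.1.1-GCC-7.3.0-2.30 would replace whatever was loaded before with that OpenMPI version and its dependencies.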

As the module tool is in charge of dynamically modifying your environment variables, it is recommended not to modify the variables $LD_LIBRARY_PATH and $CPATH in your .bashrc unless you know precisely what you are doing. If the installation instructions of a piece of software you need tell you to do so, please contact us for support; that might save you a lot of time.

For advanced users, the exact behaviour of a module (setting paths, environment variables, aliases, etc.) and its dependencies can be inspected with the command

module show

For instance:

[you@lemaitre3 ~]$ module show HDF5
------------------------------------------------------------------------------------------------------------------------
   /opt/cecisw/arch/easybuild/2018b/modules/all/HDF5/1.10.2-intel-2018b.lua:
------------------------------------------------------------------------------------------------------------------------
help([[
Description
===========
HDF5 is a data model, library, and file format for storing and managing data.
 It supports an unlimited variety of datatypes, and is designed for flexible
 and efficient I/O and for high volume and complex data.


More information
================
 - Homepage: https://portal.hdfgroup.org/display/support
]])
whatis("Description: HDF5 is a data model, library, and file format for storing and managing data.
 It supports an unlimited variety of datatypes, and is designed for flexible
 and efficient I/O and for high volume and complex data.")
whatis("Homepage: https://portal.hdfgroup.org/display/support")
conflict("HDF5")
load("intel/2018b")
load("zlib/1.2.11-GCCcore-7.3.0")
load("Szip/2.1.1-GCCcore-7.3.0")
prepend_path("CPATH","/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b/include")
prepend_path("LD_LIBRARY_PATH","/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b/lib")
prepend_path("LIBRARY_PATH","/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b/lib")
prepend_path("PATH","/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b/bin")
prepend_path("PKG_CONFIG_PATH","/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b/lib/pkgconfig")
setenv("EBROOTHDF5","/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b")
setenv("EBVERSIONHDF5","1.10.2")
setenv("EBDEVELHDF5","/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b/easybuild/HDF5-1.10.2-intel-2018b-easybuild-devel")
setenv("HDF5_DIR","/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b")

The details of the compilation options for a module can be found in the installation log file. Once the module is loaded, that log file can be found at the location

$EBROOT<module name>/easybuild/*log

For instance:

[you@lemaitre3 ~]$ ml HDF5
[you@lemaitre3 ~]$ ls $EBROOTHDF5/easybuild/*log
/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b/easybuild/easybuild-HDF5-1.10.2-20190430.140559.log
[you@lemaitre3 ~]$ ml FFTW
[you@lemaitre3 ~]$ ls $EBROOTFFTW/easybuild/*log
/opt/cecisw/arch/easybuild/2018b/software/FFTW/3.3.8-gompi-2018b/easybuild/easybuild-FFTW-3.3.8-20181011.025049.log
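The lookup shown in these examples can be wrapped in a small function. The sketch below defines easybuild_logs, a hypothetical helper. It assumes that the variable name is EBROOT followed by the software name in upper case, as in EBROOTHDF5 and EBROOTFFTW above. Software names containing characters other than letters and digits may be mapped differently and are not handled here.

```shell
# Hypothetical helper: list the installation log(s) of a loaded module.
# Assumption: the variable is EBROOT + software name in upper case.
easybuild_logs() {
    name=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    root=$(printenv "EBROOT$name") || {
        echo "EBROOT$name is not set; is the module loaded?" >&2
        return 1
    }
    ls "$root"/easybuild/*log
}
```

After ml HDF5, calling easybuild_logs hdf5 should print the same path as the ls command in the transcript above.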

The full configure command used for the compilation is saved in the log file and can be extracted with grep:

[you@lemaitre3 ~]$ grep 'INFO cmd " ./configure' $EBROOTHDF5/easybuild/*log
== 2019-04-30 14:00:55,033 run.py:506 INFO cmd " ./configure --prefix=/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu  --with-szlib=/opt/cecisw/arch/easybuild/2018b/software/Szip/2.1.1-GCCcore-7.3.0  --with-zlib=/opt/cecisw/arch/easybuild/2018b/software/zlib/1.2.11-GCCcore-7.3.0  --with-pic --with-pthread --enable-shared  --enable-cxx --enable-fortran FC="mpiifort"  --enable-unsupported --enable-parallel " exited with exit code 0 and output:

Toolchains and software organisation

The modules for the available software are organised around the notion of toolchains. A toolchain is a collection of compilers and libraries that are often used together and known to interoperate correctly. For instance, the foss (free open source software) toolchain comprises a bundle of

  • GCC (including gfortran and g++)
  • OpenMPI
  • OpenBLAS
  • ScaLAPACK
  • FFTW

Toolchain bundles are further organised into releases designated by a year followed by the letter a or the letter b. Each release corresponds to a specific set of versions of the modules that are part of the toolchain. The versions are chosen so as to minimize the risk of bugs and incompatibilities.

For instance release 2017b of the foss toolchain contains:

  • GCC/6.4.0
  • OpenBLAS/0.2.20
  • OpenMPI/2.1.1
  • ScaLAPACK/2.0.2
  • FFTW/3.3.6

Most of the CÉCI clusters also provide some releases of the intel toolchain, which comprises a bundle of

  • Intel compilers (icc, ifort, icpc)
  • Intel MPI
  • MKL

Note

For pre-installed software that requires at least the set of tools bundled in a toolchain, we follow the convention of adding a suffix -toolchain-YYYYx to the module name. The suffix identifies the toolchain used at build time, which becomes a dependency for running this software.

Warning

Mixing toolchains (e.g. foss and intel) and releases (e.g. 2021a and 2022b) is not a good idea as it can introduce linking issues and more generally compatibility issues.
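A rough check for mixed releases can be scripted by looking at the YYYYx suffix of the loaded module names. The sketch below rests on two assumptions: the module tool keeps a colon-separated list of loaded modules in $LOADEDMODULES, and a release always appears as four digits followed by a or b at the end of the name. Both function names are hypothetical. The check only detects different releases, not a mix of foss and intel within the same release.

```shell
# Print the distinct toolchain releases (e.g. 2018b) found in a
# colon-separated list of module names. With no argument, fall back to
# $LOADEDMODULES (assumed to be maintained by the module tool).
loaded_releases() {
    printf '%s\n' "${1-$LOADEDMODULES}" | tr ':' '\n' |
        sed -n 's/.*[-/]\([0-9]\{4\}[ab]\)$/\1/p' | sort -u
}

# Return non-zero and print a warning if more than one release is in use.
check_single_release() {
    count=$(loaded_releases "$@" | wc -l | tr -d ' ')
    if [ "$count" -gt 1 ]; then
        echo "warning: modules from several releases are loaded:" \
            $(loaded_releases "$@") >&2
        return 1
    fi
}
```

Running check_single_release after loading your modules gives an early hint before a build fails at link time.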

Extensions packages/modules for interpreted languages

Many extensions are installed by default for the languages that are pre-installed on the clusters. First load the module corresponding to the version of the language you want, then run the corresponding command below.

For Python, you can get the list with:

$ pip freeze
ansible==2.4.2.0
argcomplete==1.7.0
Babel==0.9.6
backports.ssl-match-hostname==3.4.0.2
bzr==2.5.1
cffi==1.6.0
chardet==2.2.1
[...]

Note

If you work with Python virtual environments, make sure to create the virtual env with --system-site-packages to make the system packages visible from the virtual env.
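As a minimal sketch of that advice, the function below creates such an environment with the standard venv module. It assumes that the loaded Python module provides Python 3 as python3. The name make_venv and the path used in the usage note are hypothetical.

```shell
# Hypothetical helper: create a virtual environment that can still see
# the packages shipped with the loaded Python module.
# All arguments are passed straight to "python3 -m venv".
make_venv() {
    python3 -m venv --system-site-packages "$@"
}
```

After loading the Python module of your choice, make_venv "$HOME/venvs/myproject" creates the environment. Once it is activated with . "$HOME/venvs/myproject/bin/activate", pip freeze should still list the pre-installed packages.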

For R, run this in an R shell:

> ip <- installed.packages(.Library)
> ip[, c(1,3)]
                  Package             Version
abc               "abc"               "2.1"
abc.data          "abc.data"          "1.0"
abind             "abind"             "1.4-5"
acepack           "acepack"           "1.4.1"
adabag            "adabag"            "4.1"
ade4              "ade4"              "1.7-8"
adegenet          "adegenet"          "2.1.0"
adephylo          "adephylo"          "1.1-10"
ADGofTest         "ADGofTest"         "0.3"
akima             "akima"             "0.6-2"
[...]

If you do not find the extension you need, you can try installing it yourself by following these procedures: Installing languages extensions