This section details the current SLURM configuration for the ISC Computational Center.
SLURM was installed from the official tarball, version 24.11.0.
All official plugins are installed.
Install / upgrade process TODO
TODO
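Until the install / upgrade process is written up, a minimal sketch of a typical tarball build (the install prefix and configure options are assumptions, not our recorded procedure):

```shell
# Assumed example: build SLURM 24.11.0 from the official tarball.
# The prefix below is illustrative, not our actual setting.
tar -xjf slurm-24.11.0.tar.bz2
cd slurm-24.11.0
./configure --prefix=/usr/local/slurm
make -j"$(nproc)"
sudo make install
```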
This is the default partition: it currently comprises the Chacha and Disco servers.
These partitions can be used to restrict an account to a single server.
More details of partitions usage TODO
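As an illustration, the per-server partitions could be declared in slurm.conf along these lines (only the node names Chacha and Disco come from this page; the partition names and AllowAccounts values are hypothetical):

```
# Illustrative slurm.conf fragment -- account names are placeholders.
PartitionName=all Nodes=Chacha,Disco Default=YES State=UP
# Per-server partitions restricting which accounts may use each server:
PartitionName=chacha Nodes=Chacha AllowAccounts=project_a State=UP
PartitionName=disco  Nodes=Disco  AllowAccounts=project_b State=UP
```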
Accounts have been created in two groups:
There are two other groups, Test and temp. Test is for administration purposes only. temp is a locked group used either to migrate someone from another account (an account cannot be deleted while a user still has it as their default account) or to prevent someone from running jobs (MaxSubmitJobs=0).
TODO
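The temp behaviour described above can be sketched with sacctmgr (the user name is a placeholder):

```shell
# Move a user's default account to temp so their old account can be deleted:
sacctmgr modify user someuser set DefaultAccount=temp
# Lock the temp account so its members cannot submit jobs:
sacctmgr modify account temp set MaxSubmitJobs=0
```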
Current QOS and account limits applied to each project account:
TODO: complete / modify until limits are finalized
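Once the limits are finalized, they would typically be applied along these lines (the QOS names follow the Premium / Standard SLA names; every numeric limit below is purely illustrative, since the real limits are still TODO):

```shell
# Illustrative only -- the real limits are not finalized yet.
sacctmgr add qos premium MaxWall=7-00:00:00 GrpTRES=cpu=128
sacctmgr add qos standard MaxWall=2-00:00:00 GrpTRES=cpu=64
# Attach a QOS to a project account (account name is a placeholder):
sacctmgr modify account project_a set QOS=premium
```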
Fairshare is one way of prioritizing jobs in the job queue.
Within each account, every user has a fairshare of 100: SLURM then computes each user's effective priority percentage from both the user's share within the account and the weight of the parent account (Premium or Standard).
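For example, the shares could be set as follows (the parent-account weights and all names are hypothetical; only the per-user share of 100 comes from this page):

```shell
# Parent accounts weighted by SLA (the 2:1 weights are illustrative):
sacctmgr modify account premium set Fairshare=2
sacctmgr modify account standard set Fairshare=1
# Every user keeps the per-user share of 100 inside their account;
# e.g. with two users in one account, each holds 100/200 = 50% of that
# account's share, which is then scaled by the parent account's weight.
sacctmgr modify user someuser set Fairshare=100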
Users fill in the form on the ISC Computational Center page; their information is then used to create their SSH access and their SLURM user under the relevant project account, according to the SLA / QOS we can provide: Premium or Standard.
Several users can share the same project account to work as a team (per-user maximum limits apply to each user, and group limits apply to the account as a whole).
The staff creates and configures user accounts.
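Creating a SLURM user under a project account can be sketched as (the user, account, and QOS names are placeholders):

```shell
# Add a new user to their team's project account:
sacctmgr add user newuser Account=project_a
# Grant the QOS matching the agreed SLA:
sacctmgr modify user newuser set QOS=premium DefaultQOS=premium
```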
Currently, all the SLURM configuration is backed up manually (configuration files and database).
TODO: automate and redirect to the backup server when it is ready
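Until that is automated, a manual backup can be sketched as follows (the paths are assumptions; slurm_acct_db is the usual slurmdbd default database name):

```shell
# Archive the SLURM configuration files (adjust the path as needed):
tar -czf slurm-etc-$(date +%F).tar.gz /usr/local/slurm/etc
# Dump the accounting database (default slurmdbd database name):
mysqldump slurm_acct_db > slurm_acct_db-$(date +%F).sql
```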
For user data, compute users currently have to request space on the filer01.hevs.ch server from the Sinf: there is a request form under "Demande de service" in "Comptes et accès > Obtention d'accès réseau". More documentation is available from the Sinf here; generally, researchers should ask for the fs_projets filesystem.
Consider all space on the Compute center to be short-lived: it is not meant for storing data or backups, only for temporary use while computing and retrieving results.