Differences

This shows you the differences between two versions of the page.

administratif:compute [2024/12/10 12:36] – [Disco & ChaCha] remi
administratif:compute [2024/12/12 14:08] (current) – removed remi
Line 1: Line 1:
-===== The ISC Computational Center ===== 
- 
-We are currently **test driving** a strategy for pooling computational resources within HEI. The current policy is therefore temporary and will evolve as we better understand the usage patterns and requirements of the various users. 
- 
-The project is expected to have three phases, which are subject to change: 
- 
-  * **Initial phase** (until March 2025): we run a best-effort, priority-less offering. The goal of this phase is to test-drive some of the tools we use (Slurm, Apptainer, etc.), estimate the demand and requirements from users, and collect data on usage patterns, e.g., periods of congestion and overall usage ratio.  
-  * **Expansion phase** (April-July 2025): we introduce priorities based on investments and on the criticality of projects, and we stabilize the toolchain, with improved documentation and more formal user support. We formalize and possibly extend the scope of the offering to more research labs and institutes within HEI.
-  * **Exploitation phase** (August 2025 onwards): depending on the success of the earlier phases, we introduce service-level agreements (SLA) and a pricing scheme so that users contribute financially to the platform according to their usage. 
- 
-If you need access, please use the [[https://forms.office.com/e/zRkBFAbKD7|following form]] to make a request. Once your request has been received, we will either provide you with access immediately or organize a meeting to discuss any specific requirements you might have. 
- 
-The following rules apply in this initial phase of the project: 
- 
-  * All users must submit [[https://apptainer.org/docs/user/latest/quick_start.html|Apptainer jobs]] via Slurm. An **example of how to build and run Apptainer images** is provided in the [[infra:apptainer_sampel|apptainer sample page]]. The rationale for this restriction is to keep visibility on cluster usage and to prevent users from maxing out cluster resources.
-  * Users must regularly clean the Apptainer cache using ''apptainer cache clean''. <wrap hi>It is probably better to define another location for the Apptainer cache, such as /tmp/.</wrap>
-  * Users are expected to search for solutions and contribute to the documentation whenever possible. Remember that this is a best-effort service without any guarantee of availability and/or support. 
-  * Prioritization is limited: in the future, users will be assigned tokens according to criteria to be approved by the steering committee. During the initial phase, priorities are tuned manually. We do not expect this constraint to be a problem, as the current number of users is small.
- 
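The rules above (Apptainer jobs submitted via Slurm, with the Apptainer cache kept in a separate location) can be sketched as a small script generator. This is only an illustration under stated assumptions: the function name ''render_sbatch_script'', the image name ''my_image.sif'', the command, the resource values, and the cache path ''/tmp/apptainer-cache'' are all hypothetical and are not taken from the cluster configuration. It relies on the standard ''#SBATCH'' directives, the ''APPTAINER_CACHEDIR'' environment variable, and ''apptainer exec''.

```python
import shlex


def render_sbatch_script(
    image,
    command,
    job_name="apptainer-job",
    cpus=4,
    time_limit="01:00:00",
    cache_dir="/tmp/apptainer-cache",  # assumed location, adjust to local policy
):
    """Return the text of a Slurm batch script running `command` in an Apptainer image."""
    # Quote every argument so that spaces in paths or arguments survive the shell.
    quoted_command = " ".join(shlex.quote(part) for part in command)
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --time={time_limit}",
        "",
        "# Keep the Apptainer cache out of the home directory.",
        f"export APPTAINER_CACHEDIR={shlex.quote(cache_dir)}",
        'mkdir -p "$APPTAINER_CACHEDIR"',
        "",
        f"apptainer exec {shlex.quote(image)} {quoted_command}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical image and command, for illustration only.
    print(render_sbatch_script("my_image.sif", ["python3", "train.py"]))
```

The generated text can be saved to a file such as ''job.sh'' and submitted with ''sbatch job.sh''. Even with a relocated cache, ''apptainer cache clean'' remains the way to free the space it uses.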
-===== Computational Resources ===== 
- 
-The computational resources currently available are:
-  * [[administratif:compute#Calypso|The Calypso cluster]] 
-  * [[administratif:compute#Rumba|The Rumba production servers]] 
-  * [[administratif:compute#Disco_&_ChaCha|DISCO and ChaCha supercomputers]] 
- 
-==== Calypso ==== 
- 
-The Calypso cluster is currently composed of the following machines: 
- 
-  * 1 DELL R740XD (Master) 
-  * 12 DELL R630 (6 currently active) 
-  * 3 DELL R630 (1 spare in 23N321, storage 1) 
-  * 1 DELL 7920 (in Rumba, currently shut down)
- 
-It can be accessed via a [[infra:wireguard|WireGuard VPN]] (ask Rémi for access).
-
-All information about this infrastructure is in [[infra:calypso|Infrastructure : Calypso]].
- 
- 
-==== Rumba ==== 
- 
-These are the production servers, running various ISC services (such as Moodle, this Wiki and other tools). Feel free to ask for more if required!
-
-All information about these servers is in [[infra:rumba|Infrastructure : Rumba]].
- 
-==== Disco & ChaCha ==== 
- 
-Research servers equipped with 2x NVIDIA A100 (DISCO) and 2x NVIDIA H100 with 96 GB of RAM each (ChaCha).
-
-All information about these servers is in [[infra:disco|Infrastructure : Disco]] and [[infra:chacha|Infrastructure : Chacha]].
- 
-==== VPS ==== 
- 
-We currently have two Virtual Private Servers (VPS) hosted at [[https://manager.infomaniak.com/|Infomaniak]].
- 
-  - [[infra:hannibal|Hannibal]] is the VPS hosting the [[https://isc.hevs.ch/learn/|ISC Learn Moodle platform]].
- 
-  - [[infra:marcellus|Marcellus]] is the legacy VPS hosting various services, such as  
-    * XXX 
-    * YYY 
- 
- 
-==== Infrastructure ==== 
- 
-For the rest of the infrastructure, see [[infra:start|Infrastructure]].
  