infra:calypso (current revision 2025/11/10 12:38, last edited by marc)

===== Physical Infrastructure =====
  * 1 DELL R740XD (Master)
  * 15 DELL R630 (11 currently active)
  * 3 DELL R630 (spares)
===== Network =====

Calypso sits in an isolated, separate network inside the school. It can be accessed via a [[infra:

Inside this network the setup is simple:
192.168.91.0/

Currently there are 11 servers running in the cluster that are accessible to students for their labs:

calypso0 : 192.168.91.10
calypso1 : 192.168.91.11
calypso2 : 192.168.91.12
calypso3 : 192.168.91.13
calypso4 : 192.168.91.14
calypso5 : 192.168.91.15
calypso6 : 192.168.91.16
calypso7 : 192.168.91.17
calypso8 : 192.168.91.18
calypso9 : 192.168.91.19
calypso10 : 192.168.91.20
The calypsomaster node is not available for student connections; it hosts the Kubernetes control plane:

calypsomaster : 192.168.88.248
calypsomaster IDRAC : 192.168.88.249

To work on the cluster, check with your teacher which nodes are allocated to you; if you need to run jobs on all nodes (via SLURM), you can pick any of them.
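As a sketch, once logged in on an allocated node, a minimal SLURM run could look like this (node name and job options are examples only; adapt them to your allocation):

```shell
# Run a one-off command on a specific worker (node names from the list above):
srun -w calypso3 hostname

# Submit a small batch job; SLURM picks any free worker.
cat > hello.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --output=hello-%j.out
hostname
EOF
sbatch hello.sh
```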
==== NAS NFS share ====

On the Calypso infrastructure,

nas (NAS appliance) : 192.168.88.250

The filesystem is mounted from **nas:/volume1/

On each student'
==== DNS server / Gateway ====

In this isolated network, the Sinf provides only its gateway as the sole DNS server:

DNS / Gateway : 172.30.7.1
==== User accounts ====

User access is SSH based, managed
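A minimal sketch of connecting over SSH ("student" is a placeholder login; use the account your teacher gave you):

```shell
# Direct connection to a worker node:
ssh student@192.168.91.10    # calypso0

# Optional ~/.ssh/config entry to shorten this:
cat >> ~/.ssh/config <<'EOF'
# Placeholder login; replace "student" with your own account
Host calypso0
    HostName 192.168.91.10
    User student
EOF
ssh calypso0
```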
==== SLURM ====
calypsomaster : no SLURM
calypso0 : SLURM controller + accounting DB
calypso[0-10] : SLURM workers
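Given this layout, the usual commands to inspect the cluster state from any worker are:

```shell
sinfo             # partitions and node states
squeue            # jobs currently queued or running
sacct -u "$USER"  # your job history, from the accounting DB on calypso0
```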
==== Configuration ====

TODO

==== Kubernetes Cluster ====

The Kubernetes control plane is on calypsomaster,
<code>
# kubectl get nodes
NAME            STATUS
calypso0
calypso1
calypso10
calypso2
calypso3
calypso4
calypso5
calypso6
calypso7
calypso8
calypso9
calypsomaster
</code>
<code>
kubectl get pv
NAME                CAPACITY
local-node-volume
</code>
<code>
kubectl get pvc
NAME
claim300g
</code>
Rules for persistent storage:

  * All nodes have up to 300G of disk space allocated to K8s, so a pod that needs non-ephemeral storage (a database, for example) must specify the matching PersistentVolumeClaim in its spec.
  * The pods will automatically launch on the node that has the right PV/PVC pair.
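As a sketch of the rule above, a pod that mounts the existing claim300g PVC could look like this (pod name, image, password, and namespace are placeholders):

```shell
kubectl apply -n isc3 -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: db-example            # placeholder name
spec:
  containers:
    - name: postgres
      image: postgres:16      # any image needing persistent storage
      env:
        - name: POSTGRES_PASSWORD
          value: changeme     # placeholder
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: claim300g  # the PVC listed above
EOF
```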
<code>
kubectl get pods -n isc3 -o=custom-columns=NAME:
NAME    STATUS
www     Running
www2    Running
</code>
==== GPU Resources ====
Each node in the cluster has the Nvidia Container Toolkit installed, which allows you to use their Nvidia

Calypso[0-9] are equipped with Nvidia A2 GPUs

Calypso[10-14] are equipped with Nvidia Tesla T4 GPUs
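To actually use a GPU from a pod, request it through the nvidia.com/gpu resource (pod name, image, and namespace are placeholders; this assumes the Nvidia device plugin is active alongside the Container Toolkit):

```shell
kubectl apply -n isc3 -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test              # placeholder name
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1   # one A2 or T4, depending on the node
EOF

# Once the pod has completed, inspect the detected GPU:
kubectl logs -n isc3 gpu-test
```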
