What still needs to be fixed
Apptainer
Make sure everyone exports the following (after adapting the paths, of course):
export APPTAINER_CACHEDIR=/scratch/gpfs/$USER/APPTAINER_CACHE
export APPTAINER_TMPDIR=/tmp
This prevents quota explosion.
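A minimal sketch of how the exports above could be wrapped for a shell profile. The function name `setup_apptainer_env` and its optional base-directory argument are hypothetical; the default path is the one given above and still needs adapting per cluster.

```shell
# Hypothetical helper: export the Apptainer cache/tmp locations listed above
# and make sure the cache directory exists.
setup_apptainer_env() {
    # $1: base scratch directory (assumption: defaults to the path given above)
    apptainer_base="${1:-/scratch/gpfs/$USER}"
    export APPTAINER_CACHEDIR="$apptainer_base/APPTAINER_CACHE"
    export APPTAINER_TMPDIR=/tmp
    # Create the cache directory so the first pull does not fail on a missing path.
    mkdir -p "$APPTAINER_CACHEDIR"
}
```

Calling `setup_apptainer_env` from `~/.bashrc` (or a site-wide profile script) would keep image layers out of the home directory for every new shell.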
Content
- Explain to PA how to create a proper structure.
- Where do we put the content of this file, as some information is not intended for the general public
- How can we make animations on the Wiki using JS + SVG and stuff ? Snow ?
- Create a proper structure, starting with the infrastructures ✔ done, after discussion merge into docs: for both groups
- With a section for students
- With a section for teachers
- Make a limited-access location with the critical information ✔ done
- Construct the tools for teacher sections with the existing information
Monitoring
- Alerting and messaging on various metrics (disk space, CPU usage, …) for the various computational resources (chacha, disco, calypso & others)
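As a starting point for the disk-space metric, a check could look like the sketch below. The function names and the threshold argument are hypothetical, and how the alert is actually delivered (mail, chat, etc.) is left open; only the POSIX `df -P` output format is relied upon.

```shell
# Hypothetical check: print the usage (in percent) of the filesystem holding $1.
disk_usage_percent() {
    # `df -P` prints one line per filesystem; column 5 is the capacity, e.g. "42%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when the filesystem holding $1 is at or above $2 percent full.
disk_over_threshold() {
    disk_usage=$(disk_usage_percent "$1")
    # An empty value means df could not report on the path.
    [ -n "$disk_usage" ] || return 2
    [ "$disk_usage" -ge "$2" ]
}
```

A cron job on each machine could then run something like `disk_over_threshold /scratch 90 && echo "scratch is filling up"` and pipe the message to whichever messaging channel is chosen.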
Server room
- Make something nice there. Posters on the walls, screens, stuff.
- Why is there still a box for a server in the networking lab room? Because we need at least one for sending a server back in case of support, and we needed one for network labs to hide the CTF network setup
- Rename the networking lab room and change the substitute for Darko as well
- Why only 10 GB for the fiber?
- I don't want a patch panel inside the server rack but outside of it. Space will soon be at a premium there and we don't know where to put the server rack: search for and buy a patch panel to put in the room
- Find a proper layout for the server room for accommodating a water-cooled rack and maybe another one in a couple of months
- If we have Rumba running there, we need some UPS solution.
- Draw a schematic of the future rack, notably to define a proper Rumba failover policy
- Choose new R630 and R730 / R740 for RUMBA main. Budget 3 kFr
- Do we need a file server from the guys downstairs (baignoire)?
- Remove the big oven again: check with Hervé Girard about storing it in 23N322, which is where a student last used it (but nicely returned it to N307), so why not keep it there? EDIT: the RoL of N322 is Thomas Sterren, I sent him a message about the oven. (Rémi) / EDIT2: the answer is “we share the room so it will stay in 307. period.”, so we need to find a place to store it ourselves.
Slurm on chacha or disco
- Make both GPUs available in gres/slurmd confs ✔ done
- Make emails work for start/end of jobs, use an emailer ✔ done
- Find out how to do resource partitioning with billing credits by user / account
- Discuss how to allocate credits for users : what about students ?
- Note everywhere to either drop sshfs for VS Code (and give links on how to configure it properly) or not use VS Code at all: noted on runjob, and started a script to check for .vscode in homedirs. Auto-rm directly in crontab?
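The homedir check mentioned above could be sketched as follows. This is not the script that was started, only an illustration: the function name is hypothetical, and also matching `.vscode-server` (the directory VS Code Remote SSH creates on the remote host) is an assumption beyond the `.vscode` check named above.

```shell
# Hypothetical scan: print every home directory under $1 that contains
# a .vscode directory (or, as an assumption, a .vscode-server directory).
list_vscode_homes() {
    home_root="${1:-/home}"
    for home in "$home_root"/*/; do
        # Skip the unexpanded pattern when the root has no subdirectories.
        [ -d "$home" ] || continue
        if [ -d "${home}.vscode" ] || [ -d "${home}.vscode-server" ]; then
            printf '%s\n' "${home%/}"
        fi
    done
}
```

The sketch only reports and never deletes, so it can run from crontab safely while the auto-rm question is still open.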
Calypso
- Reinstall Slurm by compiling it with all necessary plugins, then package it using debuild: https://slurm.schedmd.com/quickstart_admin.html#debuild , then deploy the .deb with Ansible
Rumba
- Turn on Rumba and install a proper env for us, mainly based on docker as a limited number of members will use it
- Test backup and replicate ISC / Learn on Rumba
- Migrate the wiki there
- Migrate ISC / Learn there ? TBD
- Have VPS and cloud coder there, please.
Hannibal
- Back up DokuWiki: ✔ Done already, Hannibal has /srv/www completely backed up on the Synology NAS DS923
- Add the Ingegamez website on WordPress
Site
- Proper CSS for title, also for the alignment which is ugly (look at this page!)
- Editor with no tabs
- Why is there a search box with the same text ?
- Rights done properly for every ISC member
