Change Management
When to use it
Follow this process whenever a change modifies a production service, to limit unforeseen consequences of that change.
Production scope
Process (lightweight)
Preparation
3 weeks (or more) before the target date of the change
- Prepare everything you can to limit what can go wrong during the change.
- Prepare tests that validate that the change is done and did not break anything.
- Estimate the time window needed for the change.
Validation phase
2 weeks before
- Pre-check with Pierre André what should be done on D-day:
  - What will be modified
  - How to test it
  - The downtime duration requested
During that week, fix whatever the pre-check with Pierre André showed needs correcting.
Communication
1 week before
- Validate that everything is ready to run the change: GO / NOGO from Pierre André
- Send the communication (planned maintenance template link) either by email to Pierre André and everyone affected by the change, or to the relevant Teams group (e.g. ISC Learn Platform)
1 day before
- Send a reminder notification that the change takes place the next day
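
The timeline above can be derived from the D-day date. The following is a minimal sketch, not part of the official process: the function name `change_schedule`, the milestone labels, and the example date are all hypothetical; only the offsets (3 weeks, 2 weeks, 1 week, 1 day) come from the steps above.

```python
from datetime import date, timedelta

# Days before D-day for each step, taken from the process above.
# The labels are illustrative wording, not official names.
MILESTONES = [
    ("Preparation (latest start)", 21),
    ("Pre-check with the approver", 14),
    ("GO / NOGO and planned maintenance communication", 7),
    ("Reminder notification", 1),
]


def change_schedule(d_day: date) -> list[tuple[str, date]]:
    """Return (label, date) pairs for each milestone before d_day."""
    return [(label, d_day - timedelta(days=offset)) for label, offset in MILESTONES]


# Example with a hypothetical D-day.
for label, when in change_schedule(date(2025, 6, 24)):
    print(f"{when.isoformat()}  {label}")
```

Since "3 weeks (or more)" is a minimum, the first date is the latest acceptable start for the preparation, not a fixed deadline.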
Implementation
On D-day
- In Infomaniak's dashboard, go to VPS > Hannibal > Snapshot and take a snapshot of the VM data only (Hannibal for ISC Learn). The system snapshot feature at Infomaniak seems broken (repeated failed tests from Hasdrubal).
  - NOTE: the data snapshot takes about 2h30 on Hannibal
- Send a start notification at the beginning (Start template link)
- Implement the change
- Run tests
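
Because the data snapshot takes about 2h30, it helps to work out when it must be launched. The sketch below assumes the snapshot should be finished before the start notification is sent, which the process implies by its ordering but does not state explicitly. The function name `latest_snapshot_start`, the 15-minute safety margin, and the example time are hypothetical.

```python
from datetime import datetime, timedelta

# "About 2h30 on Hannibal", from the note above.
SNAPSHOT_DURATION = timedelta(hours=2, minutes=30)


def latest_snapshot_start(
    change_start: datetime,
    margin: timedelta = timedelta(minutes=15),  # hypothetical safety margin
) -> datetime:
    """Latest time to launch the data snapshot so it completes before change_start."""
    return change_start - SNAPSHOT_DURATION - margin


# Example: a change planned to start at 18:00 on a hypothetical D-day.
print(latest_snapshot_start(datetime(2025, 6, 24, 18, 0)))  # 2025-06-24 15:15:00
```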
If all is fine in the change window
- Send the end of maintenance notification (End of maintenance template link)
If there is an issue in the change window
- Continue debugging until, at the latest, 30 min before the end of the maintenance window.
- If the issue is solved, send the end of maintenance notification (End of maintenance template link).
- If the issue is not solved 30 min before the end, send a new notification with the estimated extra delay needed to fix it.
- If the fix will take much longer than an acceptable extra delay, send an incident notification (Incident start template link) and either try to fix the issue with files from the SSD NAS backup / small desktop NAS backup, or restore the Infomaniak data snapshot.
  - Warning: it takes about 1h to restore a data snapshot.
- Prepare to continue with the Incident Process in case the snapshot restoration fails to solve the issue.
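
The decision rules for the change window can be summarised as a small function. This is only a sketch under stated assumptions: the process does not define what an "acceptable" extra delay is, so `max_extra` (defaulting to 1 hour) is a hypothetical parameter, as are the function name `next_action`, the `extra_needed` estimate, and the returned strings. Only the 30-minute cutoff comes from the process itself.

```python
from datetime import datetime, timedelta
from typing import Optional

# From the process: stop debugging 30 min before the end of the window.
DEBUG_CUTOFF = timedelta(minutes=30)


def next_action(
    now: datetime,
    window_end: datetime,
    tests_pass: bool,
    extra_needed: Optional[timedelta] = None,  # estimated extra time to fix, if known
    max_extra: timedelta = timedelta(hours=1),  # hypothetical "acceptable" limit
) -> str:
    """Return the next step to take during the maintenance window."""
    if tests_pass:
        return "send end of maintenance notification"
    if now < window_end - DEBUG_CUTOFF:
        return "continue debugging"
    if extra_needed is not None and extra_needed <= max_extra:
        return "send delay notification with estimated extra time"
    # No estimate, or the estimate exceeds the acceptable extra time.
    return "send incident notification, then fix from backup or restore snapshot"


# Example: 15 min before the end, tests still failing, fix estimated at 40 min.
end = datetime(2025, 6, 24, 22, 0)
print(next_action(datetime(2025, 6, 24, 21, 45), end, False, timedelta(minutes=40)))
```

When the last branch is reached, remember that restoring the data snapshot itself takes about 1h, so that time has to be included in whatever delay is communicated.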
