Change Management

When to use it

Follow this process whenever something modifies a production service, to limit unforeseen consequences of the change.

Production scope

Process (lightweight)

Preparation

3 weeks (or more) before the targeted date for the change

  • Prepare everything you can to limit what can go wrong during this change.
  • Prepare tests that validate the change is done and did not break anything.
  • Estimate the time window needed for the change.

Validation phase

2 weeks before

  • Pre-check what should be done on D-day with Pierre André :
    • What will be modified
    • How to test it
    • The downtime duration requested

During that week, fix whatever the pre-check with Pierre André showed needs correcting.

Communication

1 week before

  • Validate that everything is ready to run the change: GO / NOGO from Pierre André
  • Send the communication (planned maintenance template link), either by email to Pierre André and everyone affected by the change, or to the Teams group (e.g. ISC Learn Platform)

1 day before

  • Send a reminder notification that the change takes place the next day
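
The milestones above can be derived from the D-day date. The following is a minimal sketch: the offsets come from this process, while the function name, the milestone labels and the example date are illustrative assumptions.

```python
from datetime import date, timedelta

# Offsets taken from the process above; the labels are illustrative only.
MILESTONES = [
    ("Preparation (latest start)", timedelta(weeks=3)),
    ("Pre-check with Pierre André", timedelta(weeks=2)),
    ("GO / NOGO and communication", timedelta(weeks=1)),
    ("Reminder notification", timedelta(days=1)),
]


def change_schedule(d_day: date) -> list[tuple[str, date]]:
    """Return (label, date) pairs for each milestone, ending with D-day."""
    schedule = [(label, d_day - offset) for label, offset in MILESTONES]
    schedule.append(("D-day", d_day))
    return schedule


if __name__ == "__main__":
    # Hypothetical D-day, chosen only for the example.
    for label, when in change_schedule(date(2025, 6, 24)):
        print(f"{when.isoformat()}  {label}")
```

Since preparation should start "3 weeks (or more)" ahead, the first date is the latest acceptable start, not a fixed one.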

Implementation

On D-day

  • In Infomaniak's dashboard, go to VPS > Hannibal > Snapshot and take a snapshot of the VM data only (Hannibal for ISC Learn). The system snapshot feature at Infomaniak seems broken (repeated failed tests from Hasdrubal).
  • Note: the data snapshot takes about 2h30 on Hannibal
  • Send a start notification at the beginning (Start template link)
  • Implement the change
  • Run tests
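
Because the data snapshot takes about 2h30, it helps to work out when it has to be launched. The sketch below assumes the snapshot should be finished before the change is implemented, which this page implies by the order of the steps but does not state; the function name and example time are hypothetical.

```python
from datetime import datetime, timedelta

# "About 2h30 on Hannibal" according to this page; treat it as an estimate.
SNAPSHOT_DURATION = timedelta(hours=2, minutes=30)


def latest_snapshot_start(change_start: datetime) -> datetime:
    """Latest time to launch the data snapshot so it ends before the change starts."""
    return change_start - SNAPSHOT_DURATION


if __name__ == "__main__":
    # Hypothetical change start, chosen only for the example.
    print(latest_snapshot_start(datetime(2025, 6, 24, 18, 0)))
```

Adding some margin on top of this is prudent, since the 2h30 figure is approximate.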

If all is fine in the change window

If there is an issue in the change window

  • Continue debugging until at most 30 min before the end of the maintenance window.
  • If the issue is solved, send the end of maintenance notification (End of maintenance template link).
  • If the issue is not solved 30 min before the end, send a new notification with the estimated extra delay needed to fix it.
    • If the fix will take much longer than an acceptable extra delay, send an incident notification (Incident start template link), then either try to fix the issue with files from the SSD NAS backup / small desktop NAS backup, or restore the Infomaniak data snapshot.
    • Warning: restoring a data snapshot takes about 1h.
    • Prepare to continue with the Incident Process in case restoring the snapshot does not solve the issue.
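
The decision path above can be written down as a small helper. This is only a sketch: the 30 min margin and the 1h restore time come from this page, while `max_extra` (what counts as an acceptable extra delay) is a hypothetical parameter that the process does not define, and the returned strings are illustrative.

```python
from datetime import datetime, timedelta

# Durations quoted in this process; treat them as rough estimates.
DEBUG_MARGIN = timedelta(minutes=30)   # stop debugging this long before the window ends
RESTORE_DURATION = timedelta(hours=1)  # approximate data snapshot restore time


def next_action(now: datetime, window_end: datetime, solved: bool,
                extra_needed: timedelta, max_extra: timedelta) -> str:
    """Suggest the next step when an issue appears during the change window.

    max_extra is an assumption: the acceptable overrun is not defined here.
    """
    if solved:
        return "send end of maintenance notification"
    if now < window_end - DEBUG_MARGIN:
        return "continue debugging"
    if extra_needed <= max_extra:
        return "send delay notification"
    return "send incident notification, then fix from NAS backup or restore snapshot"


def restore_finished_at(restore_start: datetime) -> datetime:
    """Estimate when a data snapshot restore started now would complete."""
    return restore_start + RESTORE_DURATION
```

For example, with a window ending at 22:00, an unsolved issue at 21:30 that needs 3 more hours against an acceptable overrun of 30 min leads to the incident notification branch.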