Major incident on our hosting infrastructure in Luxembourg
Since January 8, 2020, a storage incident has been impacting some of our IaaS/PaaS customers at our Luxembourg datacenter.
- we have managed to import the ZFS data pool
- the copying of the data to another storage unit is still in progress
- about 50% of the copy is complete at this time
- the remaining steps of the data recovery can only begin once this copy has been completed
- we do not yet have any guarantee with regard to data integrity
- Domain names
- DNS
- SSL Certificates
On Wednesday January 8, 2020 at 6:53 AM Pacific (14:53 UTC), an incident occurred on one of our ZFS storage units used for our PaaS and IaaS hosting services (Gandi Simple Hosting and Gandi Cloud respectively).
The storage unit became unavailable, causing an interruption in service for all PaaS and IaaS services using a disk associated with that unit.
We followed the established procedures:
- move the control of data to an emergency machine
- inform customers impacted by the incident by email
In addition, from the moment we first became aware of the incident, we communicated live about it via our Twitter accounts @gandinoc, @gandi_net, and @gandibar.
Importing the data on the emergency machine was not possible because the metadata was corrupted. We do not yet know the cause of this corruption.
We’ve since been trying to force the data import, a maneuver that requires distributing valid metadata.
Despite the best efforts of our technical teams to restore the data on the affected storage unit, we are currently not able to recover it. The outcome of this operation is, at the time of this posting, uncertain.
This type of incident is extremely rare and in this case is limited to a single storage unit.
We will provide a full postmortem as soon as we can.
We’re very sorry for this truly unfortunate incident and we offer our sincere apologies to anyone impacted.
The Gandi team