The Gandi Community

[Resolved] Simple Hosting incident

The incident of November 11th is one of a series of incidents over the past few weeks caused by the gateway units, which provide Internet access for Simple Hosting instances.
The Simple Hosting platform has experienced several different issues, principally with the gateway equipment, which appears to be the weakest link in the architecture. The gateways are subject to:
  • HSRP instability causing short interruptions in connectivity,
  • Saturation of NAT translation tables caused by a number of factors, including DDoS attacks and customer activity,
  • High CPU usage under certain conditions.
What will Gandi do to fix the situation, replace the gateways and improve the Simple Hosting product?
  • Replace the network equipment that provides the gateway to the Internet for the Simple Hosting product with more powerful appliances and a greater number of units (scaling). The new units will better handle the current load and will support the growth of Simple Hosting instances in the near future,
  • Set up a deeper level of monitoring to better detect technical problems,
  • Implement advanced monitoring to detect abuse from specific instances, so that our technical team can react more quickly and deal with it before it affects the quality of service for all other customers.
We apologise for the inconvenience. Please be assured that our teams are working to correct these issues as quickly as possible.