Timezone: Europe/Berlin (CET - UTC +1/CEST - UTC +2)
We provide current information about infrastructure and service availability below. If you experience service impacts or performance issues, please contact our helpdesk or our Service Desk.
MetaKube Dashboard / API
INCIDENT: MetaKube Control Plane issues, region FES
Affected Components: MetaKube Control Planes, region FES
Incident Start: 2025-02-11 06:00 UTC+01:00 (CET)
State: Resolved
Description:
Availability of the MetaKube API is not guaranteed.
After scheduled network maintenance in FES, the MetaKube control cluster (which hosts the customer control planes) has problems reaching DNS. This is causing issues for the customer control planes.
This also affects Database as a Service and Observability as a Service.
All times below are CET
UPDATE 2025-02-13 12:20
We consider all service disruptions of the incident to be mitigated.
Although we are not expecting any more service disruptions, we are still watching all systems closely.
Previous Updates in chronological order
UPDATE 2025-02-11 10:00
Still investigating the DNS issue. We sent out a notifier to all potentially affected customers.
UPDATE 2025-02-11 11:15
We have used the time since the last update to narrow down the root cause of the incident. We excluded some possibilities but did not find the root cause. We are now preparing a partial rollback to downgrade the SDN again.
UPDATE 2025-02-11 12:05
We completed a partial OVN/SDN downgrade; however, this has not yet resolved the incident.
We are investigating further.
UPDATE 2025-02-11 12:25
We are exploring further downgrade approaches (previous rollbacks were, as announced, partial) and are continuing to investigate in parallel.
UPDATE 2025-02-11 13:00
As we originally updated the SDN because of a critical security vulnerability, we have decided not to perform a full OVN/SDN rollback to the initial state.
We have now activated several teams who will be developing and evaluating different solutions until 13:30. An update on how we will proceed will follow then.
UPDATE 2025-02-11 13:50
Our teams will continue developing and evaluating solutions in breakout sessions, as there are further leads but no breakthrough yet.
In parallel, we are preparing a failover for IAM and Alloy to DUS/HAM.
UPDATE 2025-02-11 15:30
Our teams' investigation into the SDN traffic loss is ongoing.
Our teams continue developing and evaluating solutions and possible workarounds.
In parallel, we are evaluating a rebuild of the SDN (software-defined network).
UPDATE 2025-02-11 17:47
Some services are still not functional.
We are still working hard to resolve the issues, but we will roll back the update of the SDN if no progress is made.
The planned maintenance period begins today, 11 February 2025, at 23:00 and is expected to last until around 06:00 CET on 12 February 2025, during which time there may be repeated interruptions to services.
The maintenance is also announced via notifier. You will also be informed via notifier when the maintenance ends.
UPDATE 2025-02-11 20:15
IAM, Alloy and Observability as a Service are restored to full functionality.
The Database as a Service API has also been restored, but the API still has some issues related to the wider SDN problem. The databases themselves were at no point affected by the incident.
UPDATE 2025-02-11 23:00
The situation with the API improved. We will continue watching it.
UPDATE 2025-02-12 10:15
The previous maintenance work did not achieve the desired result.
Workarounds have been implemented, so operations should be able to continue without disruptions.
Maintenance work will continue tonight to fully resolve the incident.
UPDATE 2025-02-12 14:55
We have scheduled another maintenance window for tonight, 12 February 2025, from 23:00 to 06:00 the following day.
During this maintenance window, there may be brief interruptions or limited availability of certain services.
The goal is the complete resolution of the incident caused by Monday's update.
An RfO will be available in our Helpdesk after the incident is mitigated.
UPDATE 2025-02-13 15:30
The incident is resolved.
Monday 10th February 2025
No incidents reported
Sunday 9th February 2025
No incidents reported
Saturday 8th February 2025
No incidents reported
Friday 7th February 2025
No incidents reported
Thursday 6th February 2025
No incidents reported
Wednesday 5th February 2025
No incidents reported
Tuesday 4th February 2025
No incidents reported
Monday 3rd February 2025
No incidents reported
Sunday 2nd February 2025
No incidents reported
Saturday 1st February 2025
No incidents reported
Friday 31st January 2025
No incidents reported
Thursday 30th January 2025
No incidents reported
Wednesday 29th January 2025
No incidents reported
Tuesday 28th January 2025
No incidents reported
Monday 27th January 2025
No incidents reported
Sunday 26th January 2025
No incidents reported
Saturday 25th January 2025
No incidents reported
Friday 24th January 2025
No incidents reported
Thursday 23rd January 2025
No incidents reported
Wednesday 22nd January 2025
No incidents reported
Tuesday 21st January 2025
SysEleven Database as a Service API
INCIDENT: minor outage of Database as a Service
Affected Components: Database as a Service, all regions