https://syseleven-status.de SysEleven Status and Incidents 2024-12-22T06:19:49.131041+00:00 SysEleven support@syseleven.de python-feedgen https://www.syseleven.de/wp-content/uploads/2020/10/SysEleven_XL_Logo_quer_RGB.png Get all incidents by feed 520 Cloudflare related network problems 2024-12-22T06:19:49.231277+00:00 <p>Affected Components: Setups that route traffic through Cloudflare experience issues</p> <p>Investigation Start: <strong>2024-09-02 13:20 UTC+01:00 (CET)</strong></p> <hr /> <p>Description:</p> <p>We are investigating network problems reported by customers that route traffic through Cloudflare</p> <hr /> <p>Customer Impact:</p> <p>Customers that are routing traffic to SysEleven networks through Cloudflare experience high retransmission rates and traffic loss.</p> <hr /> <p><strong>Update: 2024-09-02 14:20 UTC+01:00 (CET)</strong></p> <p>We are still investigating.</p> <hr /> <p><strong>Update: 2024-09-02 14:49 UTC+01:00 (CET)</strong></p> <p>We are observing significant improvements of the retransmit rates on some affected loadbalancers but will continue to observe and analyse the issues.</p> <hr /> <p><strong>Update: 2024-09-02 16:25 UTC+01:00 (CET)</strong></p> <p>Since the Problem seems to be solved for most customers, we declare the incident as solved. 
Further investigations into the reasons for the problems will continue.</p> 2024-09-02T10:20:00+00:00 522 INCIDENT: SysEleven STACK API issues, region HAM1 2024-12-22T06:19:49.229987+00:00 <p>Affected Components: <strong>SysEleven Stack API, region HAM1 and DUS2</strong></p> <p>Incident Start: <strong>2024-09-04 15:20 UTC+02:00 (CEST)</strong></p> <p>Incident End: <strong>2024-09-04 16:30 UTC+02:00 (CEST)</strong></p> <hr /> <p>Description:</p> <ul> <li>New VMs could not be created</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Spawning new virtual machines (VMs) </li> </ul> <hr /> <p><strong>Update: 16:20</strong></p> <p>We rolled out a configuration change and are observing.</p> <p><strong>Update: 16:30</strong></p> <p>We see that VMs can be created again in our automated tests. We are closing the incident</p> 2024-09-04T11:20:00+00:00 523 INCIDENT: SysEleven STACK Designate API issues 2024-12-22T06:19:49.227984+00:00 <p>Affected Components: <strong>SysEleven Stack Designate API</strong></p> <p>Incident Start: <strong>2024-09-17 20:42 UTC+02:00 (CEST)</strong></p> <p>Incident End: <strong>2024-09-17 22:15 UTC+02:00 (CEST)</strong></p> <hr /> <p>Description:</p> <ul> <li>Managing DNS zones and records through the Designate API may fail.</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Loss of control for DNS resources.</li> </ul> <hr /> <p><strong>Update: 2024-09-17 21:20 UTC+02:00 (CEST)</strong></p> <p>At 21:15, we reverted a configuration change that was rolled out earlier today, which fixed the issue. We are watching the situation.</p> <hr /> <p><strong>Update: 2024-09-17 22:15 UTC+02:00 (CEST)</strong></p> <p>Underlying issues that were persistent, but not impacting the API, are now fully resolved. 
Incident is over.</p> <hr /> 2024-09-17T16:42:00+00:00 525 INCIDENT: SysEleven STACK API issues, region DBL 2024-12-22T06:19:49.226887+00:00 <p>Affected Components: <strong>SysEleven Stack API, region DBL</strong></p> <p>Incident Start: <strong>2024-09-19 08:23</strong> Incident End: <strong>2024-09-19 08:35</strong></p> <hr /> <p>Description:</p> <ul> <li>Loss of control situation for volume and compute services</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Creating new virtual machines (VMs) or changing existing resources is not possible.</li> </ul> <hr /> <p>Update: <strong>2024-09-19 08:35</strong></p> <p>Some nova services also seemed to be affected. We investigated the situation and brought the services back up.</p> 2024-09-19T04:23:00+00:00 526 INCIDENT: SysEleven STACK Block Storage issues, region CBK 2024-12-22T06:19:49.225669+00:00 <p>Affected Components: <strong>SysEleven Stack Block Storage, region CBK</strong></p> <p>Incident Start: <strong>2024-09-24 10:55 UTC+02:00 (CEST)</strong></p> <hr /> <p>Description:</p> <p>At the moment we are facing issues with the Block Storage in Region CBK.</p> <hr /> <p>Customer Impact:</p> <ul> <li>Volumes may be unavailable</li> <li>Instances booted from a volume may be unavailable</li> </ul> <hr /> <p><strong>Update: 2024-09-24 11:00 UTC+02:00 (CEST)</strong></p> <p>We identified the issue and are fixing the problem.</p> <hr /> <p><strong>Update: 2024-09-24 12:15 UTC+02:00 (CEST)</strong></p> <p>Problem is fixed. 
Instances using volumes had to be restarted.</p> 2024-09-24T06:55:00+00:00 529 INCIDENT: PVC Failure of a hardware node, region BKI 2024-12-22T06:19:49.224564+00:00 <p>Affected Components: <strong>Hardware node failure, region BKI</strong></p> <p>Incident Start: <strong>2024-10-01 09:15 UTC+02:00 (CEST)</strong> Incident End: <strong>2024-10-01 09:36 UTC+02:00 (CEST)</strong></p> <hr /> <p>Description:</p> <ul> <li>Malfunction of a hardware node</li> <li>Restart of the hardware node is necessary </li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>During the period of restart, there will be a short interruption in the availability of the systems.</li> <li>Affected customers were notified via email.</li> <li>Please check the affected systems for their full functionality.</li> </ul> 2024-10-01T05:15:00+00:00 530 INCIDENT: SysEleven STACK Network performance issues, region DBL 2024-12-22T06:19:49.223363+00:00 <p>Affected Components: <strong>SysEleven Stack Network, region DBL</strong></p> <p>Incident Start: <strong>2024-10-01 14:17 UTC+02:00 (CEST)</strong></p> <p>Incident End: <strong>2024-10-01 14:34 UTC+02:00 (CEST)</strong></p> <hr /> <p>Description:</p> <p>At the moment, we are facing issues with the Network in Region DBL.</p> <hr /> <p>Customer Impact:</p> <ul> <li>Network performance issue</li> </ul> <hr /> <p><strong>Update: 2024-10-01 14:34 UTC+02:00 (CEST)</strong></p> <p>The incident is over, and all services are operational.</p> <hr /> 2024-10-01T10:17:00+00:00 532 INCIDENT: SysEleven STACK issues in region HAM1 2024-12-22T06:19:49.221870+00:00 <p>Affected Components: <strong>SysEleven Stack, region HAM1</strong></p> <p>Incident Start: <strong>2024-10-10 23:55</strong> Incident End: <strong>2024-10-11 00:00</strong></p> <hr /> <p>Description:</p> <ul> <li>Occurring errors were investigated. 
</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Connectivity was restricted</li> </ul> <hr /> <p><strong>Update: 01:00</strong></p> <ul> <li>We are observing further short-term issues with HAM1 region connectivity. The provider is aware of the issues and is currently investigating the situation</li> </ul> <hr /> <p><strong>Update: 01:30</strong></p> <ul> <li>The network provider is carrying out network maintenance until 05:00; we are on standby</li> </ul> 2024-10-10T19:55:00+00:00 534 INCIDENT: SysEleven STACK API issues 2024-12-22T06:19:49.220378+00:00 <p>Affected Components: <strong>SysEleven Stack API</strong></p> <p>Incident Start: <strong>2024-10-30 09:30 CET</strong></p> <p>Incident End: <strong>2024-10-30 12:05 CET</strong></p> <hr /> <p>Description:</p> <ul> <li>Accessibility of the SysEleven Stack API is not ensured.</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Spawning new virtual machines (VMs) or changing existing resources is not possible.</li> </ul> <hr /> <p><strong>Update: 10:40</strong></p> <p>We are still investigating the situation and are in contact with our external network provider to further analyze the problems.</p> <hr /> <p><strong>Update: 11:30</strong></p> <p>The issue has been identified. We are waiting for our external network provider to fix the situation.</p> <hr /> <p><strong>Update: 12:05</strong></p> <p>The issue has been resolved.</p> 2024-10-30T07:30:00+00:00 537 INCIDENT: SysEleven STACK Object Storage issues, region DBL 2024-12-22T06:19:49.219251+00:00 <p>Affected Components: <strong>SysEleven Stack Object Storage, region DBL</strong></p> <p>Incident Start: <strong>2024-11-12 17:45 UTC+01:00 (CET)</strong></p> <p>Incident End: <strong>2024-11-12 18:40 UTC+01:00 (CET)</strong></p> <hr /> <p>Description:</p> <p>At the moment we are facing issues with the Object Storage in Region DBL.</p> <hr /> <p>Customer Impact:</p> <ul> <li>Writing or reading of objects may be restricted.</li> </ul> <hr /> <p>Update: 
<strong>2024-11-12 18:40 UTC+01:00 (CET)</strong></p> <p>We mitigated the problem and are investigating further.</p> 2024-11-12T15:45:00+00:00 538 INCIDENT: SysEleven STACK issues in region CBK 2024-12-22T06:19:49.217779+00:00 <p>Affected Components: <strong>SysEleven Stack, region CBK</strong></p> <p>Incident Start: <strong>2024-11-12 22:35</strong> Incident End: <strong>2024-11-13 00:00</strong></p> <hr /> <p>Description:</p> <ul> <li>Occurring errors are currently being investigated.</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Connectivity is restricted</li> </ul> <hr /> <p><strong>Update: 23:08</strong></p> <p>The announced maintenance is having a bigger impact than expected. We are investigating the situation.</p> <hr /> <p><strong>Update: 23:45</strong></p> <p>We were able to pin down the root cause and prepare a fix to mitigate the problems.</p> <hr /> <p><strong>Update: 12:00</strong></p> <p>The network problems were mitigated. If you still encounter issues please contact us!</p> 2024-11-12T20:35:00+00:00 541 INCIDENT: Partial outage of MetaKube Control Plane Services in region FES 2024-12-22T06:19:49.216109+00:00 <p>Affected Components: <strong>MetaKube Control Plane Services, region FES</strong></p> <p>Incident Start: <strong>2024-11-18 11:30 UTC+01:00 (CET)</strong></p> <hr /> <p>Description:</p> <p>Infrastructure hosting the MetaKube Control Plane Services has problems.</p> <hr /> <p>Customer Impact:</p> <ul> <li>MetaKube Control Plane might be slow or not answering</li> </ul> <hr /> <p><strong>UPDATE 2024-11-18 12:30 UTC+01:00 (CET)</strong></p> <p>We have identified networking problems as the cause and are currently working to resolve them.</p> <p><strong>UPDATE 2024-11-18 13:27 UTC+01:00 (CET)</strong></p> <p>We have increased conntrack table size on hardware nodes to avoid networking problems.</p> <p>We continue to have issues with overloaded pods which we are working on.</p> <p><strong>UPDATE 2024-11-18 14:00 UTC+01:00 (CET)</strong></p> <p>We 
managed to get the overloaded pods running by isolating them on dedicated nodes and raising the resource limits. This stopped other issues as well.</p> <p>We still need to investigate what caused the overloading of certain pods.</p> <p>Incident is over.</p> 2024-11-18T09:30:00+00:00 543 INCIDENT: Partial degradation of SysEleven IAM services 2024-12-22T06:19:49.214844+00:00 <p>Affected Components: <strong>SysEleven IAM, regions DUS and HAM</strong></p> <p>Incident Start: <strong>2024-11-28 12:00 UTC+01:00 (CET)</strong></p> <hr /> <p>Description:</p> <ul> <li>We're currently investigating a service degradation in the SysEleven IAM. Inviting users to an organization is currently not possible.</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Inviting users to an organization is currently not possible.</li> </ul> <hr /> <p><strong>UPDATE 2024-11-28 13:10 UTC+01:00 (CET)</strong></p> <p>The issue has been resolved and inviting users to organizations is possible again</p> 2024-11-28T10:00:00+00:00 544 INCIDENT: major outage of metakube control plane services in ham1 2024-12-22T06:19:49.213594+00:00 <p>Affected Components: <strong>metakube control plane services, region ham1</strong></p> <p>Incident Start: <strong>2024-11-29 11:00 UTC+01:00 (CET)</strong></p> <p>Incident End: <strong>2024-11-29 13:40 UTC+01:00 (CET)</strong></p> <hr /> <p>Description:</p> <p>The metakube control plane services in ham1 can't be reached currently due to slow i/o</p> <hr /> <p>Customer Impact:</p> <ul> <li>Metakube services e.g. 
clusters in ham1 can't be reached</li> </ul> <hr /> <p>Customer Actions:</p> <ul> <li>Please inform us if you notice any irregularities</li> </ul> <hr /> <p>Update 13:23</p> <p>The situation improved.</p> 2024-11-29T09:00:00+00:00 545 INCIDENT: SysEleven STACK Storage issues, region HAM1 2024-12-22T06:19:49.212507+00:00 <p>Affected Components: <strong>SysEleven Stack, Storage, region HAM1</strong></p> <p>Incident Start: <strong>2024-11-29 11:00 UTC+01:00 (CET)</strong></p> <p>Incident End: <strong>2024-11-29 13:40 UTC+01:00 (CET)</strong></p> <hr /> <p>Description:</p> <p>We are facing issues with the distributed file system, a core component of the SysEleven Stack.</p> <hr /> <p>Customer Impact:</p> <ul> <li>Starting virtual machines (VMs) is partially not possible.</li> <li>Write access to volumes (VM disks) may be restricted.</li> </ul> <hr /> <p>Update 13:23</p> <p>The situation improved, storage access latencies are back to normal.</p> 2024-11-29T09:00:00+00:00 546 INCIDENT: SysEleven STACK API issues, region FES 2024-12-22T06:19:49.211138+00:00 <p>Affected Components: <strong>SysEleven Stack API, region FES</strong></p> <p>Incident Start: <strong>2024-12-04 19:07 UTC+01:00 (CET)</strong></p> <p>Incident End: <strong>2024-12-04 20:10 UTC+01:00 (CET)</strong></p> <hr /> <p>Description:</p> <ul> <li>Accessibility of the SysEleven Stack API is not ensured.</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Requests on OpenStack API may return an error code</li> <li>Spawning new virtual machines (VMs) or changing existing resources may fail</li> </ul> <hr /> <p><strong>Update: 2024-12-04 20:00 UTC+01:00 (CET)</strong></p> <p>We identified the likely root cause and a fix is being applied.</p> <hr /> <p><strong>Update: 2024-12-04 20:10 UTC+01:00 (CET)</strong></p> <p>OpenStack API is now working again as expected.</p> 2024-12-04T17:07:00+00:00 547 INCIDENT: SysEleven STACK issues in region dus2 2024-12-22T06:19:49.209972+00:00 <p>Affected Components: 
<strong>SysEleven Stack, region dus2</strong></p> <p>Incident Start: <strong>2024-12-06 17:45 UTC+01:00 (CET)</strong></p> <hr /> <p>Description:</p> <ul> <li>We are seeing some network connectivity issues in the region and are investigating.</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Connectivity is degraded for MetaKube Services in dus2 </li> <li>Connectivity to Database as a Service in dus2 is degraded</li> </ul> <hr /> <p><strong>Update: 2024-12-06 18:45 UTC+01:00 (CET)</strong></p> <ul> <li>We identified an issue with one of our gateways and are working on a fix</li> </ul> 2024-12-06T15:45:00+00:00 548 INCIDENT: Partial outage of Control Planes in Regions FES, DBL, CBK 2024-12-22T06:19:49.208428+00:00 <p>Affected Components: <strong>Control Planes in Regions FES, DBL, CBK</strong></p> <p>Incident Start: <strong>2024-12-09 18:30 UTC+01:00 (CET)</strong></p> <hr /> <p>Description:</p> <p>Some cluster control planes are not reachable.</p> <hr /> <p>Update (22:20 CET):</p> <p>We have found a way to mitigate the issues temporarily and start applying the fix.</p> <p>Update (22:50 CET):</p> <p>We applied the fix everywhere. We don't see any broken clusters anymore.</p> <hr /> <p>Update <strong>2024-12-10 12:30 UTC+01:00 (CET)</strong>:</p> <p>We identified the root cause: An unintended side-effect of an upgrade to a kube-proxy setting changed the proxy mode. 
This created iptables rules which were not cleaned up and through which traffic was dropped.</p> <p>We are taking measures to prevent this in the future.</p> 2024-12-09T16:30:00+00:00 549 INCIDENT: SysEleven STACK network issues in region DBL 2024-12-22T06:19:49.206925+00:00 <p>Affected Components: <strong>SysEleven Stack, region DBL</strong></p> <p>Incident Start: <strong>2024-12-13 14:09 UTC+01:00 (CET)</strong></p> <p>Incident End: <strong>2024-12-13 16:09 UTC+01:00 (CET)</strong></p> <hr /> <p>Description:</p> <ul> <li>DBL network has performance issues</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Network performance degradation </li> </ul> <hr /> <p><strong>Update: 2024-12-13 15:17 UTC+01:00 (CET)</strong></p> <p>Situation is back to normal.</p> <hr /> <p><strong>Update: 2024-12-13 15:57 UTC+01:00 (CET)</strong></p> <p>We notice performance degradation again.</p> <hr /> <p><strong>Update: 2024-12-13 16:09 UTC+01:00 (CET)</strong></p> <p>Situation is back to normal.</p> 2024-12-13T12:09:00+00:00 550 INCIDENT: SysEleven STACK issues in region DBL 2024-12-22T06:19:49.205033+00:00 <p>Affected Components: <strong>SysEleven Stack, region DBL</strong></p> <p>Incident Start: <strong>2024-12-18 00:47</strong> Incident End: <strong>2024-12-18 01:27</strong></p> <hr /> <p>Description:</p> <ul> <li>Occurring errors are currently being investigated.</li> </ul> <hr /> <p>Customer Impact:</p> <ul> <li>Connectivity is restricted</li> </ul> <hr /> <p>Update: <strong>2024-12-18 01:27</strong></p> <ul> <li>We saw problems with cross-region connectivity outgoing from the DBL region between 00:35 and 01:10. At the moment traffic seems to have normalized again; we are still investigating</li> </ul> <hr /> <p>Update: <strong>2024-12-18 02:00</strong></p> <ul> <li>Device causing the network issues was identified, root cause will be further investigated</li> </ul> 2024-12-17T22:40:00+00:00