Here is a brief post-mortem/root cause analysis of today's maintenance event:
Today at 05:00 UTC we began the scheduled upgrade of our primary database clusters. The goal was to perform a major database version upgrade with a short maintenance downtime of 20 minutes. The switch to the new version was executed as planned between 05:20 UTC and 05:31 UTC. During this time all our APIs were serving maintenance responses.
We initially deactivated maintenance mode, which appeared to mark the successful completion of the upgrade. At this point, the team began re-enabling IXOPAY services to verify the functionality of the platform. Unfortunately, we began encountering issues that caused a performance regression starting at 06:05 UTC. The team implemented changes to improve performance, and services appeared to have recovered at 06:30 UTC. We then continued to bring Post Processing & Reconciliation services online, which led to further unforeseen disruptions starting at 06:50 UTC.
As we were not confident that the upgrade could be completed within the announced maintenance window, the team decided to trigger the prepared rollback procedure. The rollback took effect at 07:17 UTC, at which point IXOPAY’s services had fully recovered.
The team was confident that it had a swift upgrade procedure in place, designed for minimal impact. Unfortunately, unforeseen circumstances extended the downtime beyond what was originally planned. For such failure scenarios, the team had prepared the aforementioned rollback procedure, which was successfully executed to restore our services.
We recognize the importance of our services to our clients and their applications, end users, and businesses at large. In closing, we extend our apologies for the inconvenience this event may have caused.