At Zalando we run PostgreSQL at scale: a few hundred database clusters, ranging in size from a few megabytes up to 10 terabytes of data. What is a bigger challenge than running a high-OLTP multi-terabyte PostgreSQL cluster? Migrating such a cluster from a bare-metal data center environment to AWS. There were multiple problems to solve and questions to answer: Which instance type should we choose: i3 with ephemeral storage, or m4/m5/r4 with EBS volumes? Should we give Amazon Aurora a try? A direct connection from AWS to the data center is not possible, so how do we build a replica on AWS and keep it in sync when VPN is not an option? The database (primary and replica) is used by a few hundred employees for ad-hoc queries; ideally, they should retain access through the old connection URL. How do we back up such a huge database on AWS? And we should be able to switch back to the data center if something goes wrong. In this talk, I will give a detailed account of how we solved all of these problems.