Talking VM availability at the London PaaS user group meetup

On Thursday, August 14th, we attended the London PaaS User Group (LOPUG) meetup at OpenCredo and gave a talk while we were at it. But first Richard Davies, CEO and co-founder of ElasticHosts Ltd, talked a bit about Linux containers as used in tools such as LXC and Docker, their use in IaaS and PaaS platforms, and their benefits over traditional virtualized servers.

Next it was up to me to share something on combining Cloud Foundry and OpenStack. Speaking from experience – anynines runs a public Cloud Foundry on a self-hosted OpenStack infrastructure – I shared some of the challenges and benefits of running on OpenStack, and covered virtual machine availability, failover methods and their implications for service design. And I pointed at things.

The anynines stack is built on rented hardware in a datacenter, with (initially) VMware and then Cloud Foundry on top of that. We then migrated from a rented VMware setup to a self-hosted OpenStack (because of reasons).

OpenStack upgrades

Our OpenStack-run Cloud Foundry has been running for more than 6 months. We started looking into OpenStack maybe 2 years ago; its major release back then was Diablo. We learned a lot over time. Before Grizzly, OpenStack was not ready for production. The update process involved a lot of manual work, and there were no automated (script-driven) upgrades. With manual database schema migrations and configuration file changes, the risk of breaking stuff was tremendously high. We would usually just wipe all VMs, install the upgrades and hope for the best.

With Grizzly things changed, and our sysops were optimistic that we could run Cloud Foundry on top of OpenStack. We still ran our OpenStack setup alongside our VMware setup to make sure everything ran smoothly. The switch from Havana to Icehouse was the next upgrade on our list. This was the first production upgrade – which is exciting. We used Chef to roll out Icehouse, including its configuration changes, and the upgrade was well tested on a separate multi-server OpenStack staging system.

Rolling upgrades are supported from Icehouse on. The promise is that you don’t have to shut down VMs during updates: no downtime of the entire cloud.

VM availability

OpenStack is not VMware, and we have seen some VMs dying. Pivotal saw a similar problem when it switched from VMware to AWS (or PWS). VMware’s high availability features are pretty neat. So what kills VMs? In our case: random kernel panics (a kernel bug) and hardware outages.

OpenStack comes with a concept called availability zones. You build disjoint networks, racks, etc., and each disjoint zone is an availability zone. You tell OpenStack about these availability zones. Whenever you provision a virtual machine you can choose the availability zone for it, and build your BOSH releases accordingly.
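To make the idea concrete, here is a minimal sketch of spreading the instances of a job across availability zones, so that losing one zone only takes out a share of them. It is plain Python and does not talk to OpenStack or BOSH; the function name, the zone names and the VM names are made up for illustration.

```python
from collections import defaultdict
from itertools import cycle


def spread_across_zones(vm_names, zones):
    """Assign VMs to zones round-robin and return a zone -> VM list mapping."""
    if not zones:
        raise ValueError("at least one availability zone is required")
    placement = defaultdict(list)
    # cycle() repeats the zone list, so VM i lands in zone i modulo len(zones).
    for vm, zone in zip(vm_names, cycle(zones)):
        placement[zone].append(vm)
    return dict(placement)


# Hypothetical zone and VM names, for illustration only.
placement = spread_across_zones(
    ["router-0", "router-1", "dea-0", "dea-1"],
    ["az1", "az2"],
)
print(placement)
```

With two zones, each zone ends up with one router and one DEA, which is roughly what you want your deployment manifests to express.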

Aggregates

OpenStack aggregates are similar to the availability zones concept, although the intention is not ‘failing over’ something but selecting hosts with certain attributes (e.g. an SSD aggregate). Where availability zones make sure that outages don’t escalate too much, aggregates help you place VMs on hosts with specific attributes.
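As a rough illustration of attribute-based host selection, the sketch below filters hosts by the metadata of the aggregate they belong to. This is a simplified model in plain Python, not the OpenStack scheduler or its API; the data layout, the `ssd` key and the host names are assumptions.

```python
def hosts_matching(aggregates, required):
    """Return hosts from every aggregate whose metadata has all required pairs."""
    matching = set()
    for aggregate in aggregates:
        metadata = aggregate["metadata"]
        if all(metadata.get(key) == value for key, value in required.items()):
            matching.update(aggregate["hosts"])
    return sorted(matching)


# Hypothetical aggregates and host names, for illustration only.
aggregates = [
    {"name": "ssd", "metadata": {"ssd": "true"}, "hosts": ["compute-1", "compute-2"]},
    {"name": "spinning", "metadata": {"ssd": "false"}, "hosts": ["compute-3"]},
]

ssd_hosts = hosts_matching(aggregates, {"ssd": "true"})
print(ssd_hosts)
```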

Load balancing

OpenStack’s load balancer is not inherently clustered at the moment and is thus a single point of failure. But an LBaaS failover can be realized using Pacemaker/Corosync and GlusterFS.

VM Failover strategies

In contrast to VMware, where we can rely on highly available virtual machines, we wanted (much like Amazon) to use less expensive hardware and accept the probability that hardware would go down. To harden our systems on all layers, we defined three VM failover strategies:

  1. Resurrect – where you place a monitor in your VM, detect failing ones and trigger a rebuild of VMs automatically (e.g. using Cloud Foundry BOSH). This is relatively easy, but it takes long (minutes, not seconds) and OpenStack doesn’t release persistent disks automatically.
  2. Failover to Standby VM – where you provide a stand-by VM, monitor your VM(s) and perform an IP failover using tools like Pacemaker. This is fast, although Pacemaker is not easy to use. Plus: you introduce increased resource usage with stand-by VM(s)…
  3. Service Failover – HA Postgres – Postgres is not inherently ‘clusterable’, which is why you perform the failover with a stand-by VM. An IP failover using NIC reattachment is halfway towards a PostgreSQL Cloud Foundry service. You will want to add a V2 service broker and provisioning logic.
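The ‘resurrect’ strategy boils down to a heartbeat check plus a rebuild trigger. The sketch below shows that loop body in plain Python. It is not how BOSH implements it; the function names, the 30-second timeout and the `rebuild` callback (a stand-in for whatever actually recreates the VM) are assumptions.

```python
import time


def find_failed_vms(last_heartbeat, now, timeout_seconds):
    """Return names of VMs whose last heartbeat is older than the timeout."""
    return sorted(
        name
        for name, seen_at in last_heartbeat.items()
        if now - seen_at > timeout_seconds
    )


def resurrect_failed_vms(last_heartbeat, rebuild, timeout_seconds=30.0, now=None):
    """Call rebuild(name) for every VM that missed its heartbeat deadline."""
    now = time.time() if now is None else now
    failed = find_failed_vms(last_heartbeat, now, timeout_seconds)
    for name in failed:
        rebuild(name)  # hypothetical hook, e.g. ask your deployment tool to recreate the VM
    return failed


# Hypothetical VM names and timestamps (in seconds), for illustration only.
rebuilt = []
heartbeats = {"postgres-0": 100.0, "router-0": 170.0}
failed = resurrect_failed_vms(heartbeats, rebuilt.append, timeout_seconds=30.0, now=180.0)
print(failed)
```

In a real setup this check would run periodically, and the rebuild itself is what takes minutes rather than seconds.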

Wardenized services (community services) are cute for pet projects, yet not suitable for production. Implementations are often outdated, and more importantly: one size doesn’t fit all. There’s no production-ready Cloud Foundry without high-quality, clusterable services.
