Published at 21.03.2018
The emergence of reliable infrastructures (virtual data centers controllable via API) enables new automation strategies in software operations.
The usage of persistent disks to store important data on remotely attached block devices makes virtual machines disposable. These ephemeral VMs do not need intense protection as their re-creation is cheaper than their repair.
This strategy represents a universal way of repairing virtual machines. The next thought is just around the corner: why not automate the combination of monitoring VM health and triggering the universal VM resurrection strategy on failure? The resulting software layer is not very complex but adds significant self-healing capabilities to the system.
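To make the idea concrete, here is a minimal sketch of such a self-healing layer in Python. It is not BOSH code and does not talk to a real infrastructure API: `FakeCloud`, `VM`, `Disk`, `resurrect` and `heal_once` are hypothetical names, and the "cloud" is just an in-memory dictionary. The only point is the shape of the logic: check health, and on failure throw the VM away, create a new one and move the persistent disk over.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Optional


@dataclass
class Disk:
    """A persistent disk that outlives any single VM (hypothetical model)."""
    disk_id: str
    attached_to: Optional[str] = None


@dataclass
class VM:
    """An ephemeral VM; `healthy` stands in for a real health check."""
    vm_id: str
    healthy: bool = True


class FakeCloud:
    """In-memory stand-in for an infrastructure API (assumption, not a real SDK)."""

    def __init__(self) -> None:
        self.vms: Dict[str, VM] = {}
        self._ids = count(1)

    def create_vm(self) -> VM:
        vm = VM(vm_id=f"vm-{next(self._ids)}")
        self.vms[vm.vm_id] = vm
        return vm

    def delete_vm(self, vm: VM) -> None:
        del self.vms[vm.vm_id]

    def attach(self, disk: Disk, vm: VM) -> None:
        disk.attached_to = vm.vm_id

    def detach(self, disk: Disk) -> None:
        disk.attached_to = None


def resurrect(cloud: FakeCloud, vm: VM, disk: Disk) -> VM:
    """Re-create instead of repair: drop the broken VM, keep the disk."""
    cloud.detach(disk)
    cloud.delete_vm(vm)
    new_vm = cloud.create_vm()
    cloud.attach(disk, new_vm)
    return new_vm


def heal_once(cloud: FakeCloud, vm: VM, disk: Disk) -> VM:
    """One monitoring pass: return the VM that should serve from now on."""
    return vm if vm.healthy else resurrect(cloud, vm, disk)
```

A real monitor would call something like `heal_once` periodically for every VM it knows about. The important design choice is that the failure handling never inspects what went wrong inside the VM: the data lives on the disk, so replacing the VM is always a valid answer.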
The concept of ephemeral VMs also enables advanced rollback strategies. Consider a complex update of a database server, for example an OS distribution upgrade. In theory, this shouldn’t affect the database, but reality teaches otherwise. If such an update fails, it’s barely possible to roll the virtual machine back to its previous state.
By keeping the persistent disk of the database separate, the data of the database stays safe. A new virtual machine with the new OS version can be created alongside the old one. At a chosen point in time, the old VM is turned off and the persistent disk is attached to the new VM.
In case something doesn’t work with the new VM, the old VM can be restarted with the persistent disk reattached. You will see that this strategy has also been translated to the latest container technologies as it is a proven architectural pattern.
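The upgrade-and-rollback flow described above can be sketched in a few lines of Python. Again, this is an illustration under assumptions, not how BOSH or any infrastructure API implements it: `VM`, `PersistentDisk`, `switch_disk`, `upgrade_with_rollback` and the `smoke_test` callback are all hypothetical names, and the state changes are plain attribute assignments.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VM:
    """A VM with a given OS version (hypothetical model)."""
    name: str
    os_version: str
    running: bool = False


@dataclass
class PersistentDisk:
    """The database's data disk, kept separate from any VM."""
    name: str
    attached_to: Optional[str] = None


def switch_disk(disk: PersistentDisk, from_vm: VM, to_vm: VM) -> None:
    """Stop one VM, move the persistent disk, start the other VM."""
    from_vm.running = False
    disk.attached_to = None          # detach from the stopped VM
    disk.attached_to = to_vm.name    # attach to the VM taking over
    to_vm.running = True


def upgrade_with_rollback(
    disk: PersistentDisk,
    old_vm: VM,
    new_vm: VM,
    smoke_test: Callable[[VM], bool],
) -> VM:
    """Try the new VM; if the check fails, reattach the disk to the old VM."""
    switch_disk(disk, old_vm, new_vm)
    if smoke_test(new_vm):
        return new_vm
    switch_disk(disk, new_vm, old_vm)  # rollback: the old VM was never modified
    return old_vm
```

The old VM is only stopped, never changed, which is what makes the rollback cheap. The sketch assumes that the new VM has not altered the data on the disk in a way the old software can no longer read.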
With this paradigm becoming more popular, DevOps automation technologies have adapted to it. A new generation of DevOps tools has emerged that inherently embraces the programmable nature of modern infrastructures. One of these next-gen DevOps tools is BOSH.
BOSH has been developed in the context of Cloud Foundry, but its potential is far greater than operating a single platform technology. It is no coincidence that an extremely well designed DevOps tool emerged in the context of operating a large application platform technology.
A multi-tenant capable application platform like Cloud Foundry, designed to run thousands of applications, represents a complex distributed system composed of dozens of microservices and data services. Such a system, with many microservices and large numbers of instances of each, requires a mature, highly automated operational model.
A look at the history of application platforms turns up quite a few examples of platforms that were well designed but struggled with their operational models. They either ran into quality issues or were stuck on a certain public infrastructure provider, sometimes even in a certain region. For years! The lock-in nature of proprietary infrastructure APIs hit especially hard when the competition began to spread globally. It hit again when another market opened and more and more enterprise customers asked for on-premise application platforms. Therefore, the key success factor for a modern application platform is its operational model, which in turn depends on a modern, well designed automation tool such as BOSH.
BOSH’s potential is huge and its architecture brilliant. Brilliant enough to take a closer look at BOSH’s capabilities and learn what a modern automation technology needs to look like.
For starters, BOSH provides:
True operating system independence
True infrastructure independence
Multiple levels of self-healing
Predictability and repeatability of deployments
Great scalability including
An API to automate against
And covers the entire lifecycle of complex, large-scale distributed systems including:
The point of briefly presenting BOSH is that modern automation tools have to be comprehensive. Automating the lifecycle of large-scale distributed systems is the challenge. Installing a piece of software is not a great achievement any more. It’s about taking care of all operational aspects from its cold deployment to all patch level, minor and even major updates the software goes through during its lifecycle.
Before looking into application platforms it is worth drawing a line in the sand. Why is a technology such as BOSH required if there are platform tools such as Cloud Foundry? Overly simplified, Cloud Foundry is a gigantic application server grid on steroids. Think of it as a supercharged version of large-scale shared hosting for professional developers. Its main purpose is hosting applications.
People tend not to use these terms anymore, but that is essentially what it is: shared hosting. Slicing a huge cluster into smaller chunks to run the workloads of a large number of users who potentially do not share any trust relationship with one another.
Call these apps cloud-native, call them 12-factor compliant. In any case they are stateless. They rely on so-called data services or backing services to maintain state while they themselves remain stateless and thus disposable. You can deploy as many instances of an app as often as you want. A bit of routing in front and there you go. Replacing failing instances does not need any ninja tricks with persistent disks. Just start them somewhere in the cluster.
Cloud Foundry itself, however, needs to maintain state. Its components, such as the Cloud Controller or the UAA for user authentication, require data services such as a relational database.
This is where BOSH comes into play. Think of BOSH as the 12-factor implementation for stateful applications.
When operating a multi-tenant application platform, the platform operator needs to be proficient in operating both stateless and stateful applications. Therefore, tooling for both is required.
Check out the full series here: Evolution of Software Development and Operations