Using BOSH multi-CPI feature to deploy to different IaaS

Benjamin Guttman - DevOps anynines

Introduction

We have a vSphere installation with two data centers and had been thinking for a while about adding a third availability zone without requiring an additional data center. We therefore considered moving the third availability zone to AWS. With the multi-CPI feature introduced in BOSH v261, the initial blocker was removed.

In theory, we are now able to deploy to different infrastructures. While setting up a test deployment, however, I ran into some interesting questions that were not clearly answered in the BOSH docs or in the blog posts I found on this topic, so I decided to share my struggles and possible solutions with you.

CPI

As mentioned, BOSH supports configuring several CPIs, but the documentation only seems to cover deploying to different regions (AWS, GCP) or different data centers (vSphere) of the same infrastructure. Still, it shouldn’t be that hard to deploy to two different infrastructures, right?

The first thing we’ll need is the correct CPIs packed onto our BOSH director. As we are using bosh-deployment to deploy the director, this should not be too hard. We just add the correct ops files and run bosh create-env. For ops files, however, the order matters: if you apply the CPI ops files in the wrong order, your cloud_provider block will end up with the wrong configuration.

Question 1: Which CPI ops file to apply first? 

As the CPI ops file also includes information needed to create the director, I decided to apply the CPI ops file for the infrastructure the director is deployed to last. This ensures that all information the CPI needs to deploy the director is available. An example create-env command could look like this:

bosh create-env bosh.yml \
  -o uaa.yml \
  -o credhub.yml \
  -o jumpbox-user.yml \
  -o aws/cpi.yml \
  -o vsphere/cpi.yml \
  --vars-store creds.yml \
  --vars-file vars.yml \
  --state state.yml
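
Because ops files are applied in order, it can be worth checking which CPI ends up in the cloud_provider block before running create-env. The following is a minimal sketch, assuming the same file names as in the command above. The BOSH variable and the function name are hypothetical helpers, added only so that the command line can be previewed without a director:

```shell
# BOSH is a hypothetical override: set it to "echo bosh" to preview the command.
BOSH="${BOSH:-bosh}"

# Print the CPI template that ends up in the cloud_provider block.
# Ops files are applied left to right, so the last CPI ops file wins.
show_cloud_provider_template() {
  $BOSH interpolate bosh.yml \
    -o aws/cpi.yml \
    -o vsphere/cpi.yml \
    --path /cloud_provider/template
}
```

Calling show_cloud_provider_template with the order used above should print a template that names the vSphere CPI. If it names the AWS one instead, the ops files are in the wrong order for a director running on vSphere.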

Cloud Config

Once the BOSH Director is up and running with all CPIs in place, we need to check our cloud-config for necessary adjustments. In my case, this base cloud-config was used to deploy to a vSphere environment:


azs:
- cloud_properties:
    datacenters:
    - clusters:
      - Cluster01:
          resource_pool: Test_Cluster01
      name: nameme
  name: z1
- cloud_properties:
    datacenters:
    - clusters:
      - Cluster02:
          resource_pool: Test_Cluster02
      name: nameme
  name: z2
- cloud_properties:
    datacenters:
    - clusters:
      - Cluster03:
          resource_pool: Test_Cluster03
      name: nameme
  name: z3
compilation:
  az: z1
  network: compilation
  reuse_compilation_vms: true
  vm_type: compilation
  workers: 2
disk_types:
- cloud_properties:
    type: thin
  disk_size: 2048
  name: small
- cloud_properties:
    type: thin
  disk_size: 4096
  name: medium
- cloud_properties:
    type: thin
  disk_size: 6144
  name: big
- cloud_properties:
    type: thin
  disk_size: 10144
  name: large
- cloud_properties:
    type: thin
  disk_size: 20124
  name: xlarge
networks:
- name: net
  subnets:
  - az: z1
    cloud_properties:
      name: Cluster01_TEST-1
    dns:
    - 8.8.8.8
    - 8.8.4.4
    gateway: 10.0.1.1
    range: 10.0.1.0/24
    reserved:
    - 10.0.1.1  - 10.0.1.10
    - 10.0.1.200 - 10.0.1.255
  - az: z2
    cloud_properties:
      name: Cluster02_TEST-1
    dns:
    - 8.8.8.8
    - 8.8.4.4
    gateway: 10.0.2.1
    range: 10.0.2.0/24
    reserved:
    - 10.0.2.1  - 10.0.2.16
    - 10.0.2.18 - 10.0.2.254
  - az: z3
    cloud_properties:
      name: Cluster03_TEST-1
    dns:
    - 8.8.8.8
    - 8.8.4.4
    gateway: 10.0.3.1
    range: 10.0.3.0/24
    reserved:
    - 10.0.3.1  - 10.0.3.16
    - 10.0.3.18 - 10.0.3.254
  type: manual
- name: compilation
  subnets:
  - az: z1
    cloud_properties:
      name: Cluster01_TEST-1
    dns:
    - 8.8.8.8
    - 8.8.4.4
    gateway: 10.0.1.1
    range: 10.0.1.0/24
    reserved:
    - 10.0.1.1 - 10.0.1.200
vm_types:
- cloud_properties:
    cpu: 1
    disk: 4096
    ram: 1024
  name: nano
- cloud_properties:
    cpu: 1
    disk: 10000
    ram: 4096
  name: small
- cloud_properties:
    cpu: 2
    disk: 20000
    ram: 4096
  name: medium
- cloud_properties:
    cpu: 4
    disk: 20000
    ram: 4096
  name: big
- cloud_properties:
    cpu: 4
    disk: 60000
    ram: 8192
  name: large
- cloud_properties:
    cpu: 20
    disk: 60000
    ram: 16384
  name: xlarge
- cloud_properties:
    cpu: 20
    disk: 20000
    ram: 8192
  name: compilation

The first part to check is the azs (availability zones) definition, which currently looks like this:


azs:
- cloud_properties:
    datacenters:
    - clusters:
      - Cluster01:
          resource_pool: Test_Cluster01
      name: nameme
  name: z1
- cloud_properties:
    datacenters:
    - clusters:
      - Cluster02:
          resource_pool: Test_Cluster02
      name: nameme
  name: z2
- cloud_properties:
    datacenters:
    - clusters:
      - Cluster03:
          resource_pool: Test_Cluster03
      name: nameme
  name: z3

What we need to do now is add availability zones for AWS. In our example, we add a `z4` for AWS:


azs:
- cloud_properties:
    datacenters:
    - clusters:
      - Cluster01:
          resource_pool: Test_Cluster01
      name: nameme
  name: z1
  cpi: a9s-vsphere
- cloud_properties:
    datacenters:
    - clusters:
      - Cluster02:
          resource_pool: Test_Cluster02
      name: nameme
  name: z2
  cpi: a9s-vsphere
- cloud_properties:
    datacenters:
    - clusters:
      - Cluster03:
          resource_pool: Test_Cluster03
      name: nameme
  name: z3
  cpi: a9s-vsphere
- cloud_properties:
    availability_zone: eu-central-1a
  name: z4
  cpi: aws-a9s

Note that we added the cpi key to each AZ to tell BOSH which CPI to use for which availability zone. The names you can use here are the ones defined in the CPI config, which we will look at shortly.

But first, we check the remaining parts of the cloud-config for adjustments, namely the disk_types and the vm_types:
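
Since a mismatch between the cpi value of an AZ and the names in the CPI config is easy to miss, a quick text-level check can help. This is a rough sketch and not a YAML parser: it assumes one "cpi: NAME" line per AZ in the cloud-config and top-level "- name: NAME" entries in the CPI config, as in the examples in this post. The function name and the file names in the usage line are hypothetical:

```shell
# Print every CPI name referenced by an AZ in the cloud-config ($1)
# that has no matching "- name: NAME" entry in the CPI config ($2).
# Naive line matching only; trailing comments or other layouts are not handled.
undefined_cpis() {
  cloud_config="$1"
  cpi_config="$2"
  # Extract the value of every "cpi:" line, de-duplicate, then look each one up.
  sed -n 's/^ *cpi: *//p' "$cloud_config" | sort -u | while read -r cpi; do
    grep -qxF -e "- name: $cpi" "$cpi_config" || echo "$cpi"
  done
}
```

For example, undefined_cpis cloud-config.yml cpis.yml prints nothing when every referenced CPI is defined, and otherwise prints the names that are missing from the CPI config.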


disk_types: # (vsphere)
- cloud_properties:
    type: thin
  disk_size: 2048
  name: small
disk_types: # (aws)
- cloud_properties:
    type: gp2
  disk_size: 2048
  name: small

Comparing the respective parts of a vSphere cloud-config and an AWS cloud-config, we can see that the type property is used for both infrastructures. For vSphere we set the value `thin`, for AWS we used `gp2`.

So how do we tell the CPI which type should be used?
Do we need to create separate disk_types for every infrastructure?
And if yes, how do we use them in the manifests? 

This issue can be solved via the CPI config. Looking at the CPI configuration for AWS and vSphere, we can see that for AWS the disk type defaults to ‘gp2’, so we do not need to configure it explicitly. For vSphere there is a global property named ‘default_disk_type’, which we can set via the CPI config. We will have a closer look at that right after this section. Removing the now unneeded values gives the following result:


disk_types: # (vsphere|aws)
- disk_size: 2048
  name: small

The last part to check is the vm_types definition:


vm_types: #(vsphere)
- cloud_properties:
    cpu: 1
    ram: 1024
  name: xsmall

vm_types: #(aws)
- cloud_properties:
    instance_type: t2.micro
  name: xsmall

The cloud_properties needed by the AWS and vSphere CPIs differ, so they do not overwrite each other and we can simply merge them into one. This way, every CPI takes the information it needs to create the VM.


vm_types: #(vsphere|aws)
- cloud_properties:
    cpu: 1
    ram: 1024
    instance_type: t2.micro
  name: xsmall

The last thing missing in the cloud-config is the network for the fourth availability zone:


  - az: z4
    cloud_properties:
      subnet: subnet-
    dns:
    - 10.0.4.2
    - 8.8.8.8
    - 8.8.4.4
    gateway: 10.0.4.1
    range: 10.0.4.0/24
    reserved:
    - 10.0.4.1  - 10.0.4.16
    - 10.0.4.18 - 10.0.4.254

CPI Config

The centerpiece of the multi-CPI feature is the CPI config. It includes all the information necessary to configure the CPIs in use. For a general overview, you can have a look at the official BOSH documentation.

In our case the CPI config includes not only the needed credentials and configuration information but also the ‘default_disk_type: thin’ for our vSphere VMs to solve the disk_type issue we discussed earlier:


cpis:
- name: a9s-vsphere
  type: vsphere
  properties:
    host: ((vcenter_ip))
    user: ((vcenter_user))
    password: ((vcenter_password))
    default_disk_type: thin
    datacenters:
    - clusters: ((vcenter_clusters))
      datastore_pattern: ((vcenter_ds))
      disk_path: ((vcenter_disks))
      name: ((vcenter_dc))
      persistent_datastore_pattern: ((vcenter_ds))
      template_folder: ((vcenter_templates))
      vm_folder: ((vcenter_vm_folder))
- name: aws-a9s
  type: aws
  properties:
    access_key_id: ((access_key_id))
    secret_access_key: ((secret_access_key))
    default_key_name: ((default_key_name))
    default_security_groups:
    - ((default_security_groups))
    region: ((region))

One important step to mention here: after you have uploaded the CPI config, you need to re-upload the stemcells for the different CPIs. Once this is done, the output of bosh stemcells distinguishes the stemcells by the CPI they are used for (in our case a9s-vsphere and aws-a9s).
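
The upload and re-upload steps can be sketched as follows. This is a minimal sketch, not the exact commands used for this post: the function name, the BOSH override variable and the stemcell paths are hypothetical, and vars.yml is assumed to hold the ((...)) variables referenced in the CPI config. The --fix flag is used on the assumption that the stemcell versions are already known to the director:

```shell
# BOSH is a hypothetical override: set it to "echo bosh" to preview the commands.
BOSH="${BOSH:-bosh}"

# Upload the CPI config, then re-upload one stemcell tarball per infrastructure.
# $1: CPI config file; remaining arguments: stemcell tarballs.
upload_cpi_config_and_stemcells() {
  cpi_config="$1"
  shift
  # -n skips the interactive confirmation prompt
  $BOSH -n update-cpi-config "$cpi_config" --vars-file vars.yml || return 1
  for stemcell in "$@"; do
    # --fix replaces a stemcell that already exists on the director
    $BOSH upload-stemcell --fix "$stemcell" || return 1
  done
  # List the stemcells to check that each one is tagged with its CPI
  $BOSH stemcells
}
```

A call could look like upload_cpi_config_and_stemcells cpis.yml vsphere-stemcell.tgz aws-stemcell.tgz, with one tarball for each infrastructure.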

With everything in place, I used the following manifest to deploy a Prometheus Alertmanager to both infrastructures, AWS and vSphere:


---
name: prometheus

instance_groups:
  - name: alertmanager
    azs:
      - z1
      - z4
    instances: 2
    vm_type: small
    persistent_disk: 1_024
    stemcell: default
    networks:
      - name: net
    jobs:
      - name: alertmanager
        release: prometheus
        properties:
          alertmanager:
            route:
              receiver: default
            receivers:
              - name: default
            test_alert:
              daily: true

update:
  canaries: 1
  max_in_flight: 32
  canary_watch_time: 1000-100000
  update_watch_time: 1000-100000
  serial: false

stemcells:
  - alias: default
    os: ubuntu-xenial
    version: latest

releases:
- name: prometheus
  version: 25.0.0
  url: https://github.com/bosh-prometheus/prometheus-boshrelease/releases/download/v25.0.0/prometheus-25.0.0.tgz
  sha1: 71cf36bf03edfeefd94746d7f559cbf92b62374c

In the resulting deployment, if you are familiar with the format of VM CIDs, you can see that the instance in z4 has an AWS-style VM CID and the instance in z1 a vSphere-style one.
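
To reproduce this check, the deployment can be rolled out and its VMs listed. Again a minimal sketch: the function name, the BOSH override variable and the manifest file name are hypothetical, while the deployment name prometheus is taken from the manifest above:

```shell
# BOSH is a hypothetical override: set it to "echo bosh" to preview the commands.
BOSH="${BOSH:-bosh}"

# Deploy the manifest ($1) and list the resulting VMs.
deploy_and_list_vms() {
  manifest="$1"
  # -n skips the confirmation prompt, -d selects the deployment
  $BOSH -n -d prometheus deploy "$manifest" || return 1
  # The AZ and VM CID columns show which infrastructure each instance landed on
  $BOSH -d prometheus vms
}
```

A call such as deploy_and_list_vms prometheus.yml lists one alertmanager instance per AZ, so the VM CID formats of z1 and z4 can be compared side by side.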

So let’s wrap up what needs to be done to use BOSH multi-CPI to deploy to different infrastructures:

  • Add the AZs for the new infrastructure
  • Add the vm_type information needed for the new infrastructure
  • Remove properties that can only be used by one CPI and move them to the CPI config (e.g. the disk type in cloud-config to default_disk_type in CPI config)
  • Upload a CPI config
  • Add new AZs to your manifest
  • Deploy

I hope this small blog post helps you to easily spread your deployment over different infrastructures. 

 
