Wednesday July 23
Design considerations while evaluating, developing and deploying a distributed task processing system
Konark Modi (@konarkmodi) recommends Celery as ‘one of the most robust, scalable, extendable and easy-to-implement frameworks available for distributed task processing’. Task queues are typically used as a mechanism to distribute work across threads or machines. Dedicated worker processes constantly monitor the queue for new work to perform. Celery communicates via messages, usually using a broker to mediate between clients and workers. To initiate a task (a unit of work), a client puts a message on the queue, and the broker then delivers that message to a worker. A Celery system can consist of multiple workers and brokers. For the fanatics: Celery is written in Python.
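The client, queue and worker roles described above can be sketched with nothing but the Python standard library. This is not Celery's API and not code from the talk: an in-process queue.Queue stands in for the broker, threads stand in for worker processes, and the names add, worker and run_tasks are made up for illustration.

```python
import queue
import threading


def add(x, y):
    """Hypothetical unit of work; any callable would do."""
    return x + y


def worker(broker, results):
    """Pull messages off the queue until a None sentinel arrives."""
    while True:
        message = broker.get()
        if message is None:
            broker.task_done()
            break
        task_id, func, args = message
        results[task_id] = func(*args)
        broker.task_done()


def run_tasks(tasks, num_workers=2):
    """Play the client: enqueue tasks, let workers process them, collect results."""
    broker = queue.Queue()  # stands in for the message broker
    results = {}
    workers = [
        threading.Thread(target=worker, args=(broker, results))
        for _ in range(num_workers)
    ]
    for w in workers:
        w.start()
    # The client puts one message per task on the queue.
    for task_id, (func, args) in enumerate(tasks):
        broker.put((task_id, func, args))
    # One sentinel per worker tells it to shut down.
    for _ in workers:
        broker.put(None)
    broker.join()
    for w in workers:
        w.join()
    return results


if __name__ == "__main__":
    print(run_tasks([(add, (1, 2)), (add, (3, 4))]))
```

In a real Celery deployment the queue lives in a separate broker process and the workers typically run on other machines, but the flow of messages is the same.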
Systems Integration: The OpenStack success story
Flavio Percoco (@flaper87) works for Red Hat and is a member of RDO. He is also a GSoC mentor and Rust language contributor. For his EuroPython talk on OpenStack, Flavio focused on some of the existing integration strategies that are applicable to cloud infrastructures and enterprise services.
System integration is the way a set of subsystems works together toward a shared purpose. Flavio talked us through different types of integration: vertical integration; star integration, which is really more like spaghetti integration because every system knows about the others and talks to them directly; and horizontal integration, where service A, service B and service C talk to a communication bus that then organizes the communication.
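As a rough illustration of horizontal integration, here is a minimal in-process bus sketch. The MessageBus class, the topic name instance.created and the billing and monitoring subscribers are all hypothetical. Nothing here comes from OpenStack or from the talk.

```python
from collections import defaultdict


class MessageBus:
    """Toy communication bus: services only know the bus, not each other."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register a callable to be invoked for every message on a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        """Deliver a payload to every handler subscribed to the topic."""
        for handler in list(self._subscribers.get(topic, [])):
            handler(payload)


bus = MessageBus()
received = []

# Two hypothetical services subscribe to the same topic.
bus.subscribe("instance.created", lambda p: received.append(("billing", p)))
bus.subscribe("instance.created", lambda p: received.append(("monitoring", p)))

# A third service publishes without knowing who is listening.
bus.publish("instance.created", {"id": 42})
```

Adding a fourth service means one more subscribe call. In a star layout, every existing service would need to learn about the newcomer.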
From an application's point of view, using files to integrate services is probably the oldest method. Some people use files as a message queue, which works for only a few, very specific use cases. Databases are asynchronous data-wise, which is great when integrating services, but they are not message brokers. Yet databases are probably the most commonly used integration method. Messaging, on the other hand, is loosely coupled but adds complexity. Most commonly used for notifications, messaging may depend on message routers and transformations. RPC (Remote Procedure Call) is the method most used throughout OpenStack. The message channel may vary (database, broker, etc.), but RPC's drawback is that it is tightly coupled.
OpenStack has a Shared Nothing Architecture (units don’t share memory space or anything else for that matter and know very little about the other services). Databases and RPC function inter-service, messaging cross-service.
Scaling brokers is hard, Flavio said. Depending on your use case you need a lot of memory and storage. Flavio prefers federation over centralization, as with AMQP 1.0 and Message Router. Talking tooling, Flavio recommends checking out Kombu (messaging), Celery (Distributed Task Messaging) and Oslo Messaging (RPC).
Supercharge your development environment using Docker
Deni Bertovic (@denibertovic) works at goodcode and is pretty passionate about Docker. “These days applications are getting more and more complex. It’s difficult to keep track of all the different components an application needs to function, plus it keeps getting harder to set up new development environments for developers new to the team.” Deni tells us that it is important to keep our development environment as close to production as possible.
In his introduction to Docker he briefly covers images, containers, hubs and Dockerfiles. He uses Docker to streamline development processes for his Django apps.
So what if you already use a VM or Vagrant? “Containers are fast and you can start MANY of those at the same time. It’s easy to run the whole production stack locally. Plus: it enables everyone in the team to use the (exact) same databases, libraries, etc.”
Deni talked about automation using Makefiles, or using the Docker remote API when Makefiles don’t suffice. DotCloud’s docker-py library helps you write such scripts.
Using Chef, Puppet or Ansible is a possibility, yet not a trivial one. Fig does for Docker what Vagrant does for VirtualBox. What’s more, Fig will be integrated into Docker.
Not everyone in the team needs to understand Docker internals, Deni continued. And being able to upgrade separate components more easily is just one of the reasons to start using Docker today.
For lack of a better name(server): DNS Explained
Lynn Root (@roguelynn) knows the pain of a git push that fails to show any change on your website, and the suspicion that it’s probably a ‘DNS thing’. Trying to fix DNS without a solid understanding of how it works is a bad idea.
DNS (Domain Name System) is ‘a distributed storage system for Resource Records (RR)’. To play around with DNS a little, Lynn used Scapy to sniff her own DNS traffic while browsing. Typing roguelynn.com into Chrome’s address bar, we see a DNS query take place for every autocomplete guess that Chrome makes. It first looks up www.google.com because the address bar is also Google search. Then, on typing r, it autocompletes to reddit.com and we can see the DNS query on the second line. On typing ro, Chrome guesses roguelynn-spy.herokuapp.com and we can see its related query. Finally it finds roguelynn.com. These autocompleted DNS queries seem to be something that Chrome (and perhaps other browsers) does to speed up navigation to frequently visited sites.
All of these DNS queries have a dot at the end. The difference between a name with the trailing dot and one without it is the same as the difference between absolute and relative file paths, e.g. /Users/lynnroot/Dev/site/static. Cool stuff. For an explanation of ‘where my queries are going’, caching, and other nerdy stuff, check Lynn’s blog post on the topic.
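The path analogy can be made concrete with a small sketch. The qualify function and the example.org search domain are made up for illustration, and real resolvers apply more rules (several search domains, an ndots threshold) than this simplified version.

```python
import posixpath


def qualify(name, search_domain):
    """Return the absolute (fully qualified) form of a DNS name.

    Simplified assumption: a name with a trailing dot is already absolute;
    anything else is treated as relative to the search domain.
    """
    if name.endswith("."):
        return name
    return f"{name}.{search_domain.rstrip('.')}."


# Absolute name: the trailing dot means "start at the DNS root".
absolute_name = qualify("roguelynn.com.", "example.org")

# Relative name: completed with the (hypothetical) search domain.
relative_name = qualify("www", "example.org")

# File paths behave the same way with respect to the working directory.
cwd = "/Users/lynnroot/Dev"
relative_path = posixpath.join(cwd, "site/static")
absolute_path = posixpath.join(cwd, "/etc/hosts")
```

Here absolute_name stays roguelynn.com. while relative_name becomes www.example.org., just as relative_path is resolved against cwd and absolute_path ignores it.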