Friday, 25 September 2015

Force12 on Mesos

Today we’re launching a new version of our microscaling demo running on the Mesos platform. This adds to our existing demo for ECS (EC2 Container Service).

Microscaling means starting and stopping containers in real time based on current demand. This is possible because containers start and stop much faster than virtual or physical servers.

Mesos Demo

This post explains the architecture of the demo and how we’ve tuned it to launch containers 33% faster than on ECS.



CoreOS and Fleet

Our EC2 instances are running CoreOS, which is a Linux distribution that is optimized for running containers. We start a fleet cluster on these instances and use it to bootstrap our Mesos cluster.

We’ve released the setup code for this and you can run the cluster locally with Vagrant. I’ve written a guest post for Packet’s blog explaining it in more detail. You can also use the code to spin up a cluster on their high-end physical servers.
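
To give a flavour of how fleet schedules the cluster components, here's a minimal sketch of a global fleet unit for the Mesos Agent container. The unit name, image, and ZooKeeper address are illustrative placeholders rather than our exact setup:

    [Unit]
    Description=Mesos Agent container
    After=docker.service
    Requires=docker.service

    [Service]
    # Remove any stale container left over from a previous run
    ExecStartPre=-/usr/bin/docker rm -f mesos-agent
    ExecStart=/usr/bin/docker run --name mesos-agent --net=host \
      mesosphere/mesos-slave --master=zk://zk.example.internal:2181/mesos
    ExecStop=/usr/bin/docker stop mesos-agent

    [X-Fleet]
    # Run one copy on every machine in the fleet cluster
    Global=true

Setting Global=true is what lets fleet run one Agent per machine without us placing them by hand.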

Mesos

We run all our Mesos components as Docker containers within CoreOS.  The Mesos Master instance runs the Mesos Master container and ZooKeeper. Each of the 3 “Agent” instances runs a Mesos Agent container.

Service discovery was the trickiest part of the setup. Mesos has to use ZooKeeper for coordination, but ZooKeeper doesn't have a DNS interface. This is a problem because EC2 instances are assigned IP addresses dynamically. So we use the Consul service discovery tool from HashiCorp, which does have a DNS interface.
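
For example, if the Mesos Master is registered with Consul as a service called mesos-master (an illustrative name), any node in the cluster can look it up through Consul's DNS interface, which listens on port 8600 by default:

    # Resolve the current IP of the Mesos Master via Consul DNS
    dig @127.0.0.1 -p 8600 mesos-master.service.consul +short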

Marathon

The Mesos Master instance also runs a Marathon container. Marathon is a powerful scheduler from Mesosphere that runs on top of Mesos. Our Force12 scheduler integrates with Marathon via its REST API.
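
As a minimal sketch of that integration (the Marathon address and app ID are placeholders, and this isn't our exact code), scaling a Marathon app up or down is a single PUT to the /v2/apps endpoint:

    package main

    import (
        "bytes"
        "fmt"
        "log"
        "net/http"
    )

    // scaleApp asks Marathon to run `instances` copies of an app.
    // Marathon's REST API accepts a PUT to /v2/apps/{appId} with the
    // desired instance count and starts or stops tasks to match.
    func scaleApp(marathonURL, appID string, instances int) error {
        body := bytes.NewBufferString(fmt.Sprintf(`{"instances": %d}`, instances))
        req, err := http.NewRequest("PUT", marathonURL+"/v2/apps/"+appID, body)
        if err != nil {
            return err
        }
        req.Header.Set("Content-Type", "application/json")
        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            return err
        }
        defer resp.Body.Close()
        if resp.StatusCode >= 300 {
            return fmt.Errorf("marathon returned %s", resp.Status)
        }
        return nil
    }

    func main() {
        // Illustrative values: address and app ID are placeholders
        if err := scaleApp("http://marathon.example.internal:8080", "priority1-demo", 3); err != nil {
            log.Fatal(err)
        }
    }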

We see Force12 as a container scheduler that cooperates with other schedulers: we focus on microscaling and using your servers more efficiently, and we integrate with schedulers like Marathon that provide fault tolerance for your services.

Force12 containers

Our Force12 scheduler runs within Marathon, and starts and stops Priority 1 (high priority) and Priority 2 (low priority) containers based on a random (simulated) demand metric. This metric is set by the demand-rng container and stored in Consul's key/value store.
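
Here's a sketch of how a scheduler can read that metric over Consul's HTTP KV API; the Consul address and key name are illustrative. The ?raw parameter returns the bare stored value rather than the usual base64-encoded JSON wrapper:

    package main

    import (
        "fmt"
        "io/ioutil"
        "log"
        "net/http"
        "strconv"
    )

    // currentDemand reads the simulated demand metric from Consul's KV store.
    func currentDemand(consulURL, key string) (int, error) {
        // ?raw returns the bare value instead of base64-wrapped JSON
        resp, err := http.Get(consulURL + "/v1/kv/" + key + "?raw")
        if err != nil {
            return 0, err
        }
        defer resp.Body.Close()
        b, err := ioutil.ReadAll(resp.Body)
        if err != nil {
            return 0, err
        }
        return strconv.Atoi(string(b))
    }

    func main() {
        // Illustrative address and key name
        demand, err := currentDemand("http://localhost:8500", "force12/demand")
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println("current demand:", demand)
    }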

Mesos / Marathon tuning

For the demo we want to push the limits of what is possible with microscaling, while keeping our cluster stable and without taking “heroic measures”. Here is what we’ve tuned so far.

Mesos Master – allocation interval

We reduced this command-line parameter (--allocation_interval) from 1 second to 100ms, shortening the wait between resource offer allocations. Thanks to Daniel Bryant of OpenCredo for the tip. It’s worth noting this setting is only recommended on small Mesos clusters like ours.

https://open.mesosphere.com/reference/mesos-master/
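
For reference, the flag looks like this when starting the Master (the other required Master flags are omitted here for brevity):

    # Make resource offers every 100ms instead of the 1s default
    mesos-master --allocation_interval=100ms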

Marathon – max tasks per offer

This is another command-line parameter (--max_tasks_per_offer), which we increased from 1 to 3 tasks. This gave a big speed increase because our containers are launched in parallel rather than sequentially. We set the number of tasks to 3 because our random demand metric changes by ±3 containers. “Task” is the Mesos terminology; in our case each task is a single Docker container.

https://mesosphere.github.io/marathon/docs/command-line-flags.html
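
The equivalent sketch for Marathon, again with the other flags omitted:

    # Allow Marathon to launch up to 3 tasks from a single resource offer
    marathon --max_tasks_per_offer 3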

Private Registry

For the demo our Priority 1 & 2 containers use a small image based on the BusyBox Linux distribution. This image is only 1.1 MB but each time Mesos launches a task it does a docker pull to check the image is up to date.

This means we do around 45,000 docker pulls a day, so we’ve set up a private Docker registry for this image rather than pulling it from Quay.io, which we use for our other images.
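
Standing up a private registry for a single image only takes a few commands with the official registry image; the host name and image name here are illustrative:

    # Run a registry container listening on port 5000
    docker run -d -p 5000:5000 --name registry registry:2

    # Tag and push the demo image so the Agents pull it locally
    docker tag busybox-demo registry.example.internal:5000/busybox-demo
    docker push registry.example.internal:5000/busybox-demo

One caveat: if the registry isn’t behind TLS, each Docker daemon needs the --insecure-registry flag before it’s allowed to pull from it.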

Restarting Mesos Agent containers

Our demo containers simply run a bash script in a loop as a work task. This means the load in the demo comes from the microscaling rather than from the containers themselves.

For the Mesos demo we found that the bottleneck on the cluster was CPU on the Mesos Agent servers. CPU load would build up over time, and after about 3 hours cluster performance would degrade severely and the demo would stop tracking the demand metric.

As a workaround we restart one Mesos Agent container each hour, cycling through the 3 servers in turn. Mesos handles these restarts gracefully, and all the demo containers can run on just 2 of the cluster nodes.
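
CoreOS uses systemd timers rather than cron, so a scheduled restart like this can be expressed as a oneshot service plus an hourly timer. The unit names and per-host minute offsets here are illustrative, not our exact units:

    # mesos-agent-restart.service
    [Unit]
    Description=Restart the Mesos Agent container

    [Service]
    Type=oneshot
    ExecStart=/usr/bin/docker restart mesos-agent

    # mesos-agent-restart.timer
    [Unit]
    Description=Hourly Mesos Agent restart

    [Timer]
    # Stagger the minute per host (:00, :20, :40) so only one
    # Agent restarts at a time
    OnCalendar=*-*-* *:00:00

    [Install]
    WantedBy=timers.target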

Comparison with ECS

The Mesos demo changes demand by ±3 containers rather than ±2 for our ECS demo, and the demand metric changes every 3 seconds instead of every 4. Moving from a 4-second to a 3-second cycle is a 33% increase in speed, and combined with the larger demand changes it means a big increase in the number of containers we launch each day compared with the ECS demo.

This was possible because Mesos gives us more control over the cluster, allowing us to do more tuning. With ECS we can only call the API and wait for the ECS Agent to start or stop our containers.

The tradeoff is that setting up Mesos is much more complex, as we have to bootstrap the cluster ourselves. In contrast, with ECS you form a cluster simply by launching EC2 instances that run the ECS Agent.

However, we think microscaling can provide major efficiency savings on both platforms. This is why we’re developing Force12 as a platform-agnostic solution that will run on all the major container platforms.
