Tuesday, 12 July 2016

Software Circus and Diversity in Technology


I’m unusual - I’m a woman. That seems a bizarre statement. Nonetheless, it's true in the software business. At Software Circus this September, I won’t be talking about gender because I seldom do. I’ll be speaking about the fascinating future of infrastructure opened up by containers and orchestration.


I’m always very happy to talk about technology; for over 20 years I’ve had a highly enjoyable career in it. Being female I've usually been in the minority but I’ve never felt unwelcome. I’ve never been called names or made to feel uncomfortable by my colleagues. That doesn’t mean it doesn’t happen - it does and it’s always unacceptable - but it’s unusual. It’s so uncommon that in my judgement no woman or any minority should be put off such a rewarding field because of a fear of bullying.  However, the fact remains, there aren’t many of us.


Last month, Mitchell Hashimoto, the CEO of HashiCorp, said that software was driving the world’s economy. He was right. Software is changing every aspect of life: education, healthcare, the environment, our money, leisure and work. It’s pivotal - and not just to software engineers.


Decades ago we realised politics was too important to be the province of any one group. It wasn’t easy, but political representation for women and minorities shifted from less than 5% in the late 80s to around 30% today. There’s still a way to go but we’ve made progress.


I believe that technology is no less important than politics. Everyone has a duty and a right to involve themselves in its future. We cannot afford for a significant proportion of our population to remain apart from the decisions we will make. The recent referendum in the UK has demonstrated that all parts of society must be included in and benefit from progress, or it isn’t really progress.


That’s why it’s great that tech conferences like Software Circus encourage people of all genders, races and backgrounds to learn and have fun together. Conferences are an excellent way to gain understanding of new trends. They are a key part of building networks, improving skills, enjoying your job and building a vision for the future. They’re part of your career path and where you’ll find next year’s industry leaders.


We all want to look out at a conference audience and see an enthusiastic, diverse group with a variety of ideas and opinions. I know that conferences and meetups like Software Circus, which explicitly encourage diverse attendees, do better at getting those mixed audiences, and I’m grateful.

I believe that technological progress is as important to the future of the human race as politics and I predict that we can and will do better than the politicians at making it inclusive. It will not be easy but we must do it.

Anne Currie
CTO Microscaling Systems

Tuesday, 21 June 2016

MicroBadger - helping you manage your containers



Last week at HashiConfEU our own Anne Currie unveiled the MVP of MicroBadger, a tool we’ve been working on to make managing containers easier and safer. In this post we’ll explain the motivation behind MicroBadger, show why it is already gaining support from influential people in the container ecosystem and, of course, explain how to use it.

Why MicroBadger?

As the number of Docker images in your system increases, certain questions become hard to answer.

What is in those images? How can I get back to the exact commit that created this image? How can I tell how many of the containers that I’m running come from an image containing a library with a known bug or security flaw? In fact, do I even know which of the container images I’ve built are running in my cluster?

In 2015 Sonatype released their ‘State of the Software Supply Chain’ report, which highlighted some key areas that were not receiving enough attention from the Docker community. The bottom line: we rarely have a good idea about what we are actually running. While this may be manageable for development workloads, where developers have an intimate knowledge of what they are working on and the risk is lower, it becomes a ticking time bomb when scaled up for production.

The issue becomes especially prevalent at companies running microservice architectures with quick development cycles. They may have hundreds of different services being developed, with each service’s CI pipeline pushing out many new versions per day. Bitnami captured this well in a recent TheNewStack article: the “...lifecycle and change and configuration management technologies and habits that evolved over 5-10 years of managing virtual infrastructure will need to be re-examined.”

Gareth Rushgrove of Puppet Labs has been talking about this problem for some time too and uses the analogy of a real shipping container’s manifest which states exactly what is inside the container.

With Docker this manifest becomes image metadata, and since Docker 1.6 there has been a perfectly good method for expressing such metadata: Docker labels. Unfortunately, an informal survey conducted in late 2015 concluded that fewer than 20% of images contain any labels at all. With the remaining 80% it is very difficult to figure out exactly what you’re running when something goes wrong.

What does MicroBadger do right now?

Have you ever found an image on the Docker Hub and wondered what code it was built from? Or tried to locate the Docker image for a source code repo?

By labelling containers with the source code details, MicroBadger makes it easy to move with confidence between source code repository and image hub.



Get badges for your code & image. MicroBadger supplies badges for your GitHub Readme file and Docker Hub notes, so that anyone can easily jump from your source code to Docker image and vice versa.

There are three steps to getting your badges:

  1. Push the image to Docker Hub so that it’s publicly available
  2. Look up the image on MicroBadger
  3. If your image is labelled correctly, you’ll get markdown that you can copy and paste into the GitHub readme and the Docker Hub notes

Some public images such as puppet/puppetserver and centos already include metadata, so they serve as a good basis for exploration.

With MicroBadger we would like to encourage the use of Docker labels by providing a simple way to inspect them. Of course it will require more than just Docker labels to understand a software supply chain, but we chose to start with labels. In later releases we will move on to other areas, such as inspecting the Dockerfile and following library dependencies from there.

How can I get involved?

We would like you to label your images, use our badges and give us some feedback. Check out these examples where people are already using labels, badges and metadata:




Friday, 10 June 2016

Scott Guthrie demos Microscaling

Exciting to see Scott Guthrie (EVP of Microsoft's Cloud & Enterprise Group) showing our demo. Thanks @DataSic for grabbing such a good photo!

Monday, 6 June 2016

The Joy Of Organising. Container Image Labeling

Labelling is the new black

In Docker v1.6, Red Hat contributed a way to add metadata to your Docker container images using labels. This was a great idea - a standard way to add stuff like notes or licensing conditions to an image. In fact, the only problem with labels is they aren’t used more (<20% of the time by some estimates). Gareth Rushgrove from Puppet has an interesting presentation on the concept.

We decided we really needed to add labels to our Microscaling images so we did and here are our thought processes, the tools we wrote and our recommendations for how to do the “right thing”.

Schema and formatting

Labels are just free text key/value pairs, which is flexible but potentially messy, particularly as you can label an image or container multiple times (and containers inherit labels from their images) and new keys will just overwrite any old keys with the same name.

  • Do namespace your keynames with the reverse DNS notation of your domain (e.g. com.mydomain.mykey) to avoid unplanned overwrites.
  • Don’t use com.docker.x, io.docker.x or org.dockerproject.x as the namespace in your keynames. Those prefixes are reserved.
  • Do use lowercase alphanumerics, . and - ([a-z0-9-.]) in keynames and start and end with an alphanumeric.
  • Don’t use consecutive dots or dashes in keynames.
  • Don’t add labels one at a time with individual LABEL instructions. Each LABEL instruction adds a new layer to the image so that’s inefficient. Try to add multiple labels in one call:
LABEL vendor=ACME\ Incorporated \
      com.example.is-beta= \
      com.example.is-production="" \
      com.example.version="0.0.1-beta" \
      com.example.release-date="2015-02-12"

These guidelines are not enforced, but we expect tools to come along to do that. 
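As an illustration of what such a tool might check, here is a minimal Python sketch of the guidelines above. The function name check_label_key and the exact messages are our own choices for this example, not part of Docker or any existing linter, and requiring a dot as evidence of a namespace is a simplifying assumption.

```python
import re

# Prefixes the guidelines above list as reserved.
RESERVED_PREFIXES = ("com.docker.", "io.docker.", "org.dockerproject.")

# Lowercase alphanumerics, dots and dashes; must start and end with an alphanumeric.
KEY_PATTERN = re.compile(r"^[a-z0-9](?:[a-z0-9.-]*[a-z0-9])?$")


def check_label_key(key):
    """Return a list of guideline violations for one label key (empty if none)."""
    problems = []
    if not KEY_PATTERN.match(key):
        problems.append("use only [a-z0-9-.] and start and end with an alphanumeric")
    if re.search(r"[.-]{2,}", key):
        problems.append("avoid consecutive dots or dashes")
    if key.startswith(RESERVED_PREFIXES):
        problems.append("namespace is reserved")
    if "." not in key:
        # Assumption: a key with no dot at all cannot be reverse-DNS namespaced.
        problems.append("add a reverse-DNS namespace such as com.mydomain")
    return problems


if __name__ == "__main__":
    for key in ("com.example.release-date", "vendor", "io.docker.foo", "com.example..Beta"):
        print(key, "->", check_label_key(key) or "ok")
```

Note that a check like this flags the un-namespaced vendor key in the example above, which is a reminder that the guidelines are conventions rather than rules Docker enforces.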

Labels to Use

There has been much community discussion about default and orchestrator-specific labels (see references). We reviewed that and decided to add a very minimal initial set of labels to our own images:

"Labels": {
    "org.label-schema.build-date": "2016-06-03T19:17:49Z",
    "org.label-schema.docker.dockerfile": "/Dockerfile",
    "org.label-schema.license": "Apache-2.0",
    "org.label-schema.url": "https://microscaling.com",
    "org.label-schema.vcs-ref": "f03720f",
    "org.label-schema.vcs-type": "git",
    "org.label-schema.vcs-url": "git@github.com:microscaling/microscaling.git",
    "org.label-schema.vendor": "Microscaling Systems"
}

(Updated 14 June 2016: These example labels were updated in this blog post to use a proposed common namespace - see note below).
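To show how labels like these can be consumed, here is a small Python sketch that pulls the org.label-schema entries out of docker inspect output for an image, where labels appear under Config.Labels. The sample JSON is trimmed and illustrative, and the helper names schema_labels and source_reference are hypothetical.

```python
import json

SCHEMA_PREFIX = "org.label-schema."

# Trimmed, illustrative `docker inspect` output using label values from this post.
SAMPLE_INSPECT = """
[{"Config": {"Labels": {
    "org.label-schema.vcs-ref": "f03720f",
    "org.label-schema.vcs-url": "git@github.com:microscaling/microscaling.git",
    "org.label-schema.license": "Apache-2.0"
}}}]
"""


def schema_labels(inspect_json):
    """Return the org.label-schema.* labels with the prefix stripped."""
    entries = json.loads(inspect_json)
    # Labels is null when an image has no labels at all.
    labels = entries[0].get("Config", {}).get("Labels") or {}
    return {
        key[len(SCHEMA_PREFIX):]: value
        for key, value in labels.items()
        if key.startswith(SCHEMA_PREFIX)
    }


def source_reference(labels):
    """Return (vcs-url, vcs-ref) when both are present, otherwise None."""
    if "vcs-url" in labels and "vcs-ref" in labels:
        return labels["vcs-url"], labels["vcs-ref"]
    return None


if __name__ == "__main__":
    print(source_reference(schema_labels(SAMPLE_INSPECT)))
```

With both vcs-url and vcs-ref present you can get from a running image back to the exact commit that built it.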

Adding Labels to your Dockerfile

Although you can just hardcode all these values into your Dockerfile, that’s a bit manual (because you can’t have dynamic labels in a Dockerfile). We therefore built a makefile to automate passing dynamic labels into our Dockerfile, which is checked into our GitHub repo.
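The general idea can be sketched in Python as follows. This is not our makefile, just an illustration of one way to pass dynamic values in at build time. It assumes the Dockerfile declares ARG VCS_REF and ARG BUILD_DATE and uses them in its LABEL instruction. The image name example/app and the function name build_command are hypothetical.

```python
from datetime import datetime, timezone


def build_command(image, vcs_ref, build_date=None):
    """Build a `docker build` argument list that passes dynamic label values.

    Assumes the Dockerfile contains ARG VCS_REF and ARG BUILD_DATE and
    references them in its LABEL instruction.
    """
    if build_date is None:
        build_date = datetime.now(timezone.utc)
    stamp = build_date.strftime("%Y-%m-%dT%H:%M:%SZ")
    return [
        "docker", "build",
        "--build-arg", "VCS_REF=" + vcs_ref,
        "--build-arg", "BUILD_DATE=" + stamp,
        "-t", image,
        ".",
    ]


if __name__ == "__main__":
    # The ref would normally come from `git rev-parse --short HEAD`.
    print(" ".join(build_command("example/app", "f03720f")))
```

Building the argument list separately from running it makes the command easy to print and check before it is handed to something like subprocess.run.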

Licensing

The most controversial label in our opinion is license. Does an image have a single license? It might easily have multiple licenses if you are building, say, a Ruby app on top of Alpine. We therefore expect the license label to potentially be multi-valued (comma separated), for example “Apache-2.0, GPL-3.0”.

For consistency we used the approach that each license is a text string license identifier as defined by SPDX (http://spdx.org/licenses/).
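A tool reading this label would need to split it into individual identifiers. A minimal Python sketch, assuming the comma-separated convention described above (the function name parse_licenses is hypothetical):

```python
def parse_licenses(value):
    """Split a comma-separated license label into a list of identifiers."""
    return [part.strip() for part in value.split(",") if part.strip()]


if __name__ == "__main__":
    print(parse_licenses("Apache-2.0, GPL-3.0"))
```

Each resulting string can then be compared against the SPDX license list.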

Go forth and label

It’s very easy to add your own labels to your images. Metadata is very useful as a way to organise and track images, which can otherwise get pretty out of control!

References



Update 14 June 2016

After talking about these ideas with some folks across the container community, we're starting to work with them on a standardized label schema that we can all use in common. More details to follow very soon, but as a consequence of this we have started to use org.label-schema rather than our own namespace (com.microscaling) for some metadata, and we're updating this post to use that common namespace in the examples. If you'd like to get involved with the discussion of the standard label schema straight away, please let us know on @microscaling.

Tuesday, 17 May 2016

Microscaling Marathon with DC/OS on Azure Container Service

This post shows how you can use our Microscaling Engine with the Marathon container orchestrator. Microscaling makes a system's architecture dynamic – able to respond to real demand in real time. In this demo you'll scale a queue to maintain a target length.


To keep things simple we haven't set up service discovery on the Mesos cluster. Instead we're using an Azure Storage Queue that all the containers can access.



For the demo we'll be creating 4 Marathon Apps that run as Docker containers.
  • Producer - containers add items to the queue. We start 3 producers on startup and you can scale these manually using the Marathon UI.
  • Consumer - containers remove items from the queue and are scaled by the Microscaling Engine.
  • Remainder - any spare capacity is used for this background task. You can change the Docker image to use your own using the Marathon UI.
  • Microscaling - the engine scales the demo and sends data to our Microscaling-in-a-Box site.
For this walkthrough we'll be using a DC/OS cluster running on Azure Container Service. However, the Microscaling Engine will integrate with any instance of Marathon.
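To give a feel for how the consumer and remainder apps share capacity, here is a deliberately simple Python sketch of scaling towards a target queue length. It is not the Microscaling Engine's actual algorithm. The one-step-at-a-time rule, the function names and the numbers are all assumptions made for illustration.

```python
def next_consumer_count(queue_length, target_length, consumers, max_containers):
    """Step the consumer count towards whatever keeps the queue at its target.

    Add one consumer when the queue is too long, remove one when it is too
    short, and stay within 0..max_containers.
    """
    if queue_length > target_length:
        consumers += 1
    elif queue_length < target_length:
        consumers -= 1
    return max(0, min(consumers, max_containers))


def remainder_count(consumers, max_containers):
    """Any capacity not used by consumers goes to the background task."""
    return max_containers - consumers


if __name__ == "__main__":
    consumers = 3
    for queue_length in (80, 75, 60, 45, 50):
        consumers = next_consumer_count(queue_length, 50, consumers, 10)
        print(queue_length, consumers, remainder_count(consumers, 10))
```

The point of the remainder task is that spare capacity is never idle: whatever the consumers do not need is handed to background work.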




Install Steps

Azure Storage Account

You'll need an Azure Storage Account and access keys to create the queue.

  • If you don’t currently have an Azure subscription you can get a free trial.
  • Sign in to the Azure Portal.
  • Navigate to New -> Data + Storage -> Storage Account
  • Create a storage account with these settings:
      • Name - this must be globally unique across all Azure Storage Accounts.
      • Replication - make sure you choose Locally-redundant storage for the queue.
      • Resource Group - create a new resource group for the queue.
  • Once the storage account has been created navigate to Settings -> Access Keys and make a note of your access keys.

Launch ACS Cluster

  • See this tutorial for launching an ACS cluster.
  • In the Basics screen set the following:
      • USERNAME - e.g. "azureuser". You'll need this for connecting to the cluster via SSH.
      • SSH PUBLIC KEY - also needed for SSH access. See this tutorial for generating an SSH key if you don't have one.
      • RESOURCE GROUP - choose a name for your cluster.
  • In the ACS Settings screen set the following:
      • AGENT COUNT - increase from 1 to 2.
      • DNS PREFIX - choose a name for your cluster.

Open SSH tunnel

  • To access the Marathon REST API and the DC/OS web UIs you need to open an SSH tunnel on port 80 to the master node.

Create Marathon Apps

  • You're now ready to install the Marathon demo apps. Please sign up for a Microscaling-in-a-Box account if you don't already have one.
  • On the Start page choose Mesos / Marathon.
  • The Install page links to this post so you can continue to the Run page.

  • On the Run page you'll see the commands for installing the Marathon Apps. The marathon-install command needs Ruby but has no other dependencies.

  • For MSS_MARATHON_API use http://localhost/marathon if you're using ACS.
  • You'll also need to set AZURE_STORAGE_ACCOUNT_NAME and AZURE_STORAGE_ACCOUNT_KEY using the storage account name and the access key you noted earlier.

View The Demo

  • Once Marathon has launched the apps the results will appear in the Microscaling-in-a-Box site. You'll see the Microscaling Engine scaling the consumer and remainder containers to maintain the target queue length.
  • You can also view the demo in the DC/OS UI using the SSH tunnel. The DC/OS dashboard (http://localhost/), Marathon (http://localhost/marathon) and Mesos (http://localhost/mesos) UIs are all available.

Cleaning Up

Uninstall the Marathon Apps

  • You can use the marathon-uninstall command to remove the demo apps from your cluster.
MSS_MARATHON_API=*YOUR_MARATHON_API* ./marathon-uninstall

Delete the Azure Resources

Important: You'll be charged for running the Azure VMs and any data stored on the Azure Queue, so it's important to delete these when you don't need them anymore.
  • Sign in to the Azure Portal.
  • Select Resource Groups from the left hand menu.
  • Find and delete the 2 Resource Groups you created for the ACS cluster and the Azure Queue.

Next Steps

As well as supporting Marathon and Azure Storage Queues we also support the Docker API and NSQ. We'll be adding support for more queues and container orchestrators.

Stay abreast of all the latest developments by following us on Twitter or starring us on GitHub:


Monday, 25 April 2016

3 Ways DevOps Is Helping Destroy The World

What?!! No, really. Here's my doomsday rationale:
  1. A scary proportion of the world’s energy is already spent powering data centres (~2%).
  2. Cloud data centre capacity is growing fast but is inefficiently used (average ~10% resource utilisation).
  3. Infrastructure as code makes it easy to overprovision in the cloud. If machine creation can be scripted it can be reproduced. You can create 100 machines as easily as 1.
So, in 5 years’ time the 2% of energy currently being used in data centres could be 4% or higher. That’s a lot of power stations. And a lot of overprovisioning.

Why do we Overprovision in the Cloud?
Because we can.

DevOps has given us the power and it’s not like we have a lot of choice. Autoscaling is not magic - you can’t scale up in real time because it takes at least a few minutes to bring up a new VM. So, if you cannot predict the future (and if you can, your remuneration package must be good) then to handle unexpected demand you have to overprovision and keep that extra capacity sitting around hot-but-idle. And then thank goodness you have that choice. In the old days we just fell over ;-)

Cloud+devops gives us the option of overprovisioning to avoid failure. So we do overprovision and that’s not a crazy judgement. If you fall over your company might fail. Overprovisioning is just money.

What does philosophy have to say? 
My favourite 18th century German philosopher Kant would say “what if everyone overprovisioned their infrastructure?”

The answer is higher energy usage in a world where energy generation is mostly CO2-producing fossil fuels. Hmm. Kant would say not ideal. It’s a shame to be sitting in a cloud of CO2, but it’s particularly galling if that’s just to keep data centre capacity idle.

What can we do?
Hope for AWS to save us! 

Or, alternatively....

VMs and cloud infrastructure don’t help with server utilisation. It feels like they should but the data suggest that in practice they don’t. They probably make it worse. However, there are lots of new technologies coming along that do help: containers, microscaling, orchestrators & serverless architectures (potentially).

Just look at Google: they use all of these technologies to achieve server utilisation of around 70%, which is >5 times what the rest of us manage. If we were all achieving that then maybe devops wouldn't help destroy the world after all.
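To put rough numbers on that, here is a back-of-the-envelope Python sketch. The workload size (70 fully busy servers' worth of work) is a made-up figure for illustration, and the two utilisation levels are the approximate ones quoted in this post.

```python
def servers_needed(work_in_busy_servers, utilisation_percent):
    """Servers required to carry a workload at a given average utilisation.

    Integer ceiling division is used so the result is never rounded down.
    """
    return -(-work_in_busy_servers * 100 // utilisation_percent)


if __name__ == "__main__":
    work = 70  # hypothetical workload, measured in fully busy servers
    typical = servers_needed(work, 10)
    efficient = servers_needed(work, 70)
    print(typical, efficient, typical // efficient)
```

For the same work, the difference between ~10% and ~70% utilisation is the difference between 700 machines and 100.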