Tuesday, 21 June 2016

MicroBadger - helping you manage your containers


Last week at HashiConfEU our own Anne Currie unveiled the MVP of MicroBadger, a tool we’ve been working on to make managing containers easier and safer. In this post we’ll explain the motivation behind MicroBadger, show why it is already gaining support from influential people in the container ecosystem and, of course, explain how to use it.

Why MicroBadger?

As the number of Docker images in your system increases, certain questions become hard to answer.

What is in those images? How can I get back to the exact commit that created this image? How can I tell how many of the containers that I’m running come from an image containing a library with a known bug or security flaw? In fact, do I even know which of the container images I’ve built are running in my cluster?

In 2015 Sonatype released their ‘State of the Software Supply Chain’ report, which highlighted some key areas that were not receiving enough attention from the Docker community. The bottom line: we rarely have a good idea about what we are actually running. While this may be manageable for development workloads, where developers have an intimate knowledge of what they are working on and the risk is lower, it becomes a ticking time bomb when scaled up for production.

The issue is especially prevalent at companies running microservice architectures with quick development cycles. They may have hundreds of different services being developed, with each service’s CI pipeline pushing out many new versions per day. Bitnami captured this well in a recent TheNewStack article: the “...lifecycle and change and configuration management technologies and habits that evolved over 5-10 years of managing virtual infrastructure will need to be re-examined.”

Gareth Rushgrove of Puppet Labs has been talking about this problem for some time too, and uses the analogy of a real shipping container’s manifest, which states exactly what is inside the container.

With Docker this manifest becomes image metadata, and since Docker 1.6 there has been a perfectly good method for expressing such metadata: Docker labels. Unfortunately, an informal survey conducted in late 2015 concluded that fewer than 20% of images contain any labels at all. With the remaining 80% it is very difficult to figure out exactly what you’re running when something goes wrong.

What does MicroBadger do right now?

Have you ever found an image on the Docker Hub and wondered what code it was built from? Or tried to locate the Docker image for a source code repo?

By labelling containers with the source code details, MicroBadger makes it easy to move with confidence between source code repository and image hub.



Get badges for your code & image. MicroBadger supplies badges for your GitHub Readme file and Docker Hub notes, so that anyone can easily jump from your source code to Docker image and vice versa.

There are three steps to getting your badges:

  1. Label your image with its source code details
  2. Push the image to Docker Hub so that it’s publicly available
  3. Look up the image on MicroBadger

If your image is labelled correctly you’ll get markdown that you can then copy-and-paste to the GitHub readme and into the Docker Hub notes.

Some public images, such as puppet/puppetserver and centos, already include metadata, so they serve as a good basis for exploration.

With MicroBadger we would like to encourage the use of Docker labels by providing a simple way to inspect them. Of course it will require more than just Docker labels to understand a software supply chain, but we chose to start with labels. In later releases we will move on to other areas, such as inspecting the Dockerfile and following library dependencies from there.

How can I get involved?

We would like you to label your images, use our badges and give us some feedback. Check out these examples where people are already using labels, badges and metadata:




Friday, 10 June 2016

Scott Guthrie demos Microscaling

Exciting to see Scott Guthrie (EVP of Microsoft's Cloud & Enterprise Group) showing our demo. Thanks @DataSic for grabbing such a good photo!

Monday, 6 June 2016

The Joy Of Organising. Container Image Labeling

Labelling is the new black

In Docker v1.6, Red Hat contributed a way to add metadata to your Docker container images using labels. This was a great idea - a standard way to add stuff like notes or licensing conditions to an image. In fact, the only problem with labels is that they aren’t used more (<20% of the time by some estimates). Gareth Rushgrove from Puppet has an interesting presentation on the concept.

We decided we really needed to add labels to our Microscaling images, so we did. Here are our thought processes, the tools we wrote and our recommendations for how to do the “right thing”.

Schema and formatting

Labels are just free-text key/value pairs, which is flexible but potentially messy, particularly as you can label an image or container multiple times (and containers inherit labels from their images), and a new value will simply overwrite any existing value with the same key.

  • Do namespace your keynames with the reverse DNS notation of your domain (e.g. com.mydomain.mykey) to avoid unplanned overwrites.
  • Don’t use com.docker.x, io.docker.x or org.dockerproject.x as the namespace in your keynames. Those prefixes are reserved.
  • Do use lowercase alphanumerics, . and - ([a-z0-9-.]) in keynames and start and end with an alphanumeric.
  • Don’t use consecutive dots or dashes in keynames.
  • Don’t add labels one at a time with individual LABEL instructions. Each LABEL instruction adds a new layer to the image, so that’s inefficient. Try to add multiple labels in one instruction:
LABEL vendor=ACME\ Incorporated \
      com.example.is-beta= \
      com.example.is-production="" \
      com.example.version="0.0.1-beta" \
      com.example.release-date="2015-02-12"

These guidelines are not enforced, but we expect tools to come along to do that. 
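
Until such tools exist, the guidelines above are simple enough to check yourself. Here is a minimal sketch in Python, assuming our reading of the rules in the list above; the function name check_label_key and the wording of the messages are our own illustration, not part of Docker or any existing tool.

```python
import re

# Prefixes the guidelines above describe as reserved.
RESERVED_PREFIXES = ("com.docker.", "io.docker.", "org.dockerproject.")

# Lowercase alphanumerics, '.' and '-', starting and ending with an alphanumeric.
KEY_CHARS = re.compile(r"^[a-z0-9](?:[a-z0-9.-]*[a-z0-9])?$")


def check_label_key(key):
    """Return a list of guideline violations for a label key (empty if none)."""
    problems = []
    if not KEY_CHARS.match(key):
        problems.append("invalid characters or bad first/last character")
    if ".." in key or "--" in key:
        problems.append("consecutive dots or dashes")
    if key.startswith(RESERVED_PREFIXES):
        problems.append("reserved namespace")
    if "." not in key:
        # Heuristic only: a key with no dot cannot be in reverse DNS notation.
        problems.append("not namespaced")
    return problems


if __name__ == "__main__":
    for key in ("com.example.release-date", "vendor", "com.docker.foo"):
        print(key, check_label_key(key) or "ok")
```

Note that a bare key such as vendor is flagged as not namespaced. That is a deliberate choice in this sketch, since an un-namespaced key is exactly the kind that risks unplanned overwrites.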

Labels to Use

There has been much community discussion about default and orchestrator-specific labels (see references). We reviewed that and decided to add a very minimal initial set of labels to our own images:

"Labels": {
    "org.label-schema.build-date": "2016-06-03T19:17:49Z",
    "org.label-schema.docker.dockerfile": "/Dockerfile",
    "org.label-schema.license": "Apache-2.0",
    "org.label-schema.url": "https://microscaling.com",
    "org.label-schema.vcs-ref": "f03720f",
    "org.label-schema.vcs-type": "git",
    "org.label-schema.vcs-url": "git@github.com:microscaling/microscaling.git",
    "org.label-schema.vendor": "Microscaling Systems"
}

(Updated 14 June 2016: These example labels were updated in this blog post to use a proposed common namespace - see note below).
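
To read labels like these back out of an image, you can parse the JSON that docker inspect prints. The sketch below is a rough illustration in Python. It assumes the output is a JSON array whose first entry holds the labels under Config.Labels, and the helper name schema_labels is hypothetical.

```python
import json


def schema_labels(inspect_output, prefix="org.label-schema."):
    """Return the labels under `prefix`, with the prefix stripped.

    `inspect_output` is the JSON text printed by `docker inspect`,
    assumed here to be an array with the labels at [0].Config.Labels.
    """
    entries = json.loads(inspect_output)
    # Labels can be missing or null when an image has none.
    labels = (entries[0].get("Config") or {}).get("Labels") or {}
    return {
        key[len(prefix):]: value
        for key, value in labels.items()
        if key.startswith(prefix)
    }


if __name__ == "__main__":
    sample = """[{"Config": {"Labels": {
        "org.label-schema.vcs-ref": "f03720f",
        "org.label-schema.vcs-type": "git"
    }}}]"""
    print(schema_labels(sample))
```

In practice you would feed it the real output of docker inspect for your image rather than the inline sample.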

Adding Labels to your Dockerfile

Although you can just hardcode all these values into your Dockerfile, that’s a bit manual (because you can’t have dynamic labels in a Dockerfile). We therefore built a Makefile, checked into our GitHub repo, that automates passing dynamic label values into our Dockerfile at build time.
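
The general idea can be sketched without the Makefile itself. The Python below is not our Makefile, just an illustration of one way to do it: it assembles a docker build command that passes the commit and build date in with --build-arg. It assumes a Dockerfile that declares ARG VCS_REF and ARG BUILD_DATE and uses them in its LABEL instruction; those argument names, the function names and the image name example/app are all hypothetical.

```python
from datetime import datetime, timezone


def build_date(now=None):
    """Format a timestamp like 2016-06-03T19:17:49Z (UTC)."""
    now = now or datetime.now(timezone.utc)
    return now.strftime("%Y-%m-%dT%H:%M:%SZ")


def docker_build_command(image, vcs_ref, when=None, context="."):
    """Build the argument list for `docker build` with dynamic values.

    Assumes the Dockerfile declares ARG VCS_REF and ARG BUILD_DATE
    (hypothetical names) and references them in a LABEL instruction.
    """
    return [
        "docker", "build",
        "--build-arg", f"VCS_REF={vcs_ref}",
        "--build-arg", f"BUILD_DATE={build_date(when)}",
        "-t", image,
        context,
    ]


if __name__ == "__main__":
    # vcs_ref would normally come from `git rev-parse --short HEAD`.
    print(" ".join(docker_build_command("example/app", "f03720f")))
```

Returning a list rather than a shell string means the command can be handed straight to subprocess.run without worrying about quoting.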

Licensing

The most controversial label in our opinion is license. Does an image have a single license? It might easily have multiple licenses if you are building, say, a Ruby app on top of Alpine. We therefore expect the license label to potentially be multi-valued (comma separated), for example “Apache-2.0, GPL-3.0”.

For consistency we used the approach that each license is a text string: a license identifier as defined by SPDX (http://spdx.org/licenses/).
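
A tool consuming this label would need to split it back into individual identifiers. Here is a small Python sketch of that, assuming the comma-separated convention described above. The function names are our own, and the set of known identifiers is a tiny hand-picked sample rather than the full SPDX list.

```python
# A tiny sample of SPDX identifiers, for illustration only.
KNOWN_LICENSES = {"Apache-2.0", "GPL-3.0", "MIT"}


def parse_licenses(value):
    """Split a comma-separated license label into individual identifiers."""
    return [part.strip() for part in value.split(",") if part.strip()]


def unknown_licenses(value, known=KNOWN_LICENSES):
    """Return the identifiers in the label that are not in `known`."""
    return [name for name in parse_licenses(value) if name not in known]


if __name__ == "__main__":
    print(parse_licenses("Apache-2.0, GPL-3.0"))
    print(unknown_licenses("Apache-2.0, Made-Up-1.0"))
```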

Go forth and label

It’s very easy to add your own labels to your images. Metadata is very useful as a way to organise and track images, which can otherwise get pretty out of control!

References



Update 14 June 2016

After talking about these ideas with some folks across the container community, we're starting to work with them on a standardized label schema that we can all use in common. More details to follow very soon, but as a consequence of this we have started to use org.label-schema rather than our own namespace (com.microscaling) for some metadata, and we're updating this post to use that common namespace in the examples. If you'd like to get involved with the discussion of the standard label schema straight away, please let us know on @microscaling.