Included in the recent release of Docker 1.10 is a feature
destined to become more important with future releases: support for
user namespaces. At the moment, it’s not enabled in a fresh install,
and it still feels a little bleeding edge compared to more established
Docker features, but it does work and is worth getting to know.
I spent a little time getting familiar; by no means enough to claim
expertise, but enough to make it work. Hopefully the fact that
it’s new to me will make it easier for me to explain to others, since
I hit some obstacles on the way to getting it to work.
I’m assuming that if you’re reading this, you may have seen one of
the excellent pages on user namespaces in Docker. Very briefly,
the idea is to map the “root” user in the container to some normal
(unprivileged) user on the host system. This prevents
containers from modifying files on the host, even with mapped volumes,
and closes other security holes that would otherwise let
containers improperly obtain privileges on the host.
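To make the mapping idea concrete, here is a minimal sketch of the arithmetic involved. It assumes a single mapping entry in the style of /proc/<pid>/uid_map (first ID inside, first ID outside, length of the range). The starting host UID 296608 is the one that shows up on my machine later in this post; the range length of 65536 is an assumption, and the function name is made up for illustration.

```shell
#!/bin/sh
# One mapping entry: <first id inside> <first id outside> <length>.
# 296608 is the host UID seen later in this post; 65536 is an assumed length.
MAP_INSIDE=0
MAP_OUTSIDE=296608
MAP_LENGTH=65536

# Print the host UID that corresponds to a UID inside the container.
# Prints "unmapped" and returns 1 if the UID is outside the mapped range.
container_to_host_uid() {
    uid=$1
    if [ "$uid" -ge "$MAP_INSIDE" ] && [ "$uid" -lt $((MAP_INSIDE + MAP_LENGTH)) ]; then
        echo $((MAP_OUTSIDE + uid - MAP_INSIDE))
    else
        echo "unmapped"
        return 1
    fi
}

container_to_host_uid 0      # container root -> 296608 on the host
container_to_host_uid 1000   # an ordinary container user -> 297608
```

The point is that "root" inside the container is just an offset into a range of otherwise ordinary host UIDs.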
I started with a fresh install of Ubuntu Wily; however, despite this
being the latest, it doesn’t have a very new Docker in the default set
of packages. So we need to move on to using Docker’s own repository.
In my Ansible playbook, the setup looks like this:
- name: docker apt key
  apt_key: keyserver=keyserver.ubuntu.com id=F76221572C52609D
- name: install docker repo
  apt_repository: repo='deb http://apt.dockerproject.org/repo ubuntu-wily main'
- name: Install packages
  apt: pkg=docker-engine state=installed update_cache=yes
Note that the package name changed recently from ‘docker.io’ to
‘docker-engine’.
With that done, and the Docker service started, we now have:
root@penguin64:~# docker -v
Docker version 1.10.1, build 9e83765
However, this installation does not have user namespaces enabled.
To enable it, we need to pass an argument to the Docker daemon.
Here’s the first place where there is a potential for confusion. On
Ubuntu, there is a file /etc/default/docker with some content;
however, this file is not used now that Docker has switched over to
running services with systemd. Instead, the expected way to
handle it is to create a “drop-in”. Systemd takes configuration
files from /lib/systemd, but it also looks in /etc/systemd for
files that override the defaults. This is a nice feature in that
it avoids the issue of having a package manager not be able to
update a file because it’s been customized.
The convention with systemd is to create an override directory for
each service. Since the Docker configuration file lives in
/lib/systemd/system/docker.service, this means a directory called
/etc/systemd/system/docker.service.d. Any *.conf files in this
directory will override settings in the default configuration file.
mkdir -p /etc/systemd/system/docker.service.d
cat >/etc/systemd/system/docker.service.d/userns.conf <<EOD
[Service]
ExecStart=
ExecStart=/usr/bin/docker daemon -H fd:// --userns-remap=default
EOD
The first ExecStart= clears out the default value. This is needed because
systemd supports multiple processes in a single service for some service
types, so a new ExecStart= line would otherwise be added to the default
rather than replace it. The second replaces the default value with the
command we want. Getting that command right was itself a little painful, since
docker daemon --help in 1.10 isn’t terribly verbose when it comes to
identifying what kind of parameter is expected for --userns-remap.
(That documentation issue has been fixed in latest master.)
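Since forgetting the empty ExecStart= line is an easy mistake, a small sanity check can help before reloading systemd. The following is only a sketch: the function name is hypothetical, and it checks nothing beyond the order of ExecStart= lines in the file it is given.

```shell
#!/bin/sh
# Return 0 if the first ExecStart= line in the file is empty (clearing the
# default) and a later ExecStart= line carries a command; return 1 otherwise.
check_execstart_override() {
    awk '
        /^ExecStart=/ {
            n++
            value = substr($0, length("ExecStart=") + 1)
            if (n == 1 && value != "") bad = 1     # default was never cleared
            if (n > 1 && value != "")  found = 1   # replacement command present
        }
        END { exit ((n >= 2 && !bad && found) ? 0 : 1) }
    ' "$1"
}

# Try it on a throwaway copy of the drop-in contents.
tmp=$(mktemp)
printf '%s\n' '[Service]' 'ExecStart=' \
    'ExecStart=/usr/bin/docker daemon -H fd:// --userns-remap=default' > "$tmp"
if check_execstart_override "$tmp"; then
    echo "override looks right"
fi
rm -f "$tmp"
```

On a real system the argument would be /etc/systemd/system/docker.service.d/userns.conf.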
Of course, there are other options besides default, but the default
worked for my purposes; it has Docker create a dockremap user and remap
container root to one of that user’s unprivileged subordinate UIDs on the host.
With this file in place, we need to reload systemd, then docker:
systemctl daemon-reload
systemctl restart docker
Here’s where I hit the second obstacle. I tried running a Docker image,
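only to find out that I had no images. When switching to a separate
namespace, Docker creates a new directory under /var/lib/docker, named
after the remapped UID and GID, and starts with an empty image store there: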
vagrant@penguin64:~$ sudo ls -l /var/lib/docker
drwx------ 9 296608 296608 4096 Feb 23 00:59 296608.296608
drwx------ 2 root root 4096 Feb 12 21:09 containers
drwx------ 5 root root 4096 Feb 23 00:42 devicemapper
drwx------ 3 root root 4096 Feb 12 21:09 image
drwxr-x--- 3 root root 4096 Feb 12 21:09 network
drwx------ 2 root root 4096 Feb 23 00:43 tmp
drwx------ 2 root root 4096 Feb 12 21:09 trust
drwx------ 2 root root 4096 Feb 12 21:09 volumes
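The 296608.296608 name is the host UID and GID that container root maps to. With the default setting, those come from the subordinate ID ranges in /etc/subuid and /etc/subgid, where each line has the form name:start:count. As a rough sketch, the directory name can be predicted from those files. The function name is hypothetical, and it assumes the user and the group share the same name, as I believe they do for dockremap.

```shell
#!/bin/sh
# Print "<uid>.<gid>" for the first subordinate range of the given name.
# $1: user/group name, $2: subuid file, $3: subgid file
userns_dir_name() {
    uid_start=$(awk -F: -v u="$1" '$1 == u { print $2; exit }' "$2")
    gid_start=$(awk -F: -v u="$1" '$1 == u { print $2; exit }' "$3")
    # Fail if the name has no entry in either file.
    [ -n "$uid_start" ] && [ -n "$gid_start" ] || return 1
    echo "${uid_start}.${gid_start}"
}

# Example call on a real system (not run here):
#   userns_dir_name dockremap /etc/subuid /etc/subgid
```

If the printed name matches the new directory under /var/lib/docker, the remapping is using the range you expect.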
No big deal; I just had to pull the image I wanted again.
Finally, we can get down to starting a container and seeing
the effect of namespaces:
vagrant@penguin64:~$ docker run -it --rm centos /bin/bash
[root@65fd7566b552 /]# whoami
root
So inside the container, it still thinks of itself as root.
But root inside the container is not root on the host system:
vagrant@penguin64:~$ docker run -it --rm -v /opt:/opt centos /bin/bash
[root@690e37988416 /]# ls -ld /opt
drwxr-xr-x 2 65534 65534 4096 Nov 6 21:38 /opt
[root@690e37988416 /]# touch /opt/file1
touch: cannot touch '/opt/file1': Permission denied
And this means that it is no longer possible to use a SUID trick
to root the host.
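The 65534 owner shown for /opt follows from running the mapping in the other direction. Host root (UID 0) is not part of the range mapped into the container, so the kernel reports it as the overflow UID, which defaults to 65534 (see /proc/sys/kernel/overflowuid). Here is a small sketch of that lookup, again assuming a single mapping entry starting at 296608 with an assumed length of 65536; the function name is made up.

```shell
#!/bin/sh
# One mapping entry: <first id inside> <first id outside> <length>.
# 296608 comes from the listing earlier in this post; 65536 is an assumption.
MAP_INSIDE=0
MAP_OUTSIDE=296608
MAP_LENGTH=65536
OVERFLOW_UID=65534   # kernel default for IDs with no mapping

# Print the UID a process inside the container sees for a given host UID.
host_to_container_uid() {
    uid=$1
    if [ "$uid" -ge "$MAP_OUTSIDE" ] && [ "$uid" -lt $((MAP_OUTSIDE + MAP_LENGTH)) ]; then
        echo $((uid - MAP_OUTSIDE + MAP_INSIDE))
    else
        echo "$OVERFLOW_UID"
    fi
}

host_to_container_uid 0        # host root, the owner of /opt -> 65534
host_to_container_uid 296608   # the remapped user -> 0 (container root)
```

Seen from the host, the touch was attempted by UID 296608, which is neither the owner nor in the group of a root-owned directory with mode drwxr-xr-x, so it gets Permission denied.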
Right now, this feature is limited so that all containers on a host
share the same namespace. On the roadmap is supporting per-container
namespaces, allowing finer control over what each container can access
on the host.