Backing up and restoring Docker containers

You should back up your data, properly! If you’re not, you’re playing a dangerous game with fate. Computers are pretty reliable, but they also go wrong, often. You should always back up your files, but backing up a containerized application isn’t quite as simple.

A container is 3 things:

  • Configuration
  • Volumes
  • Networking

The point of a container is that it’s rather self-contained. You don’t need to back up the container image itself, as that’s defined by the image name and tag in your configuration (or your docker run command) and can simply be pulled again. You should be using sane versioning on that (not latest).

#Backup

A compose file may be made up of multiple containers, and perhaps even multiple distinct applications. When backing up applications, it’s important to back up any dependent containers too, such as databases.

#Configuration

A container isn’t just data; there’s configuration in there too. This configuration defines not only which image to run, but also environment variables, networking preferences and, of course, mounts.

Exactly how your containers are configured is up to you, although on the small scale I suspect you, like me, are using docker-compose. If you’re not, and you’re using something like a manual docker run command, or a systemd service, you’re kinda doing it wrong. If you’re using things like Kubernetes, then there’s a little more to back up, but chances are you already know what you’re doing.

By keeping a record of the image name and tag you used (e.g. postgres:12-alpine), you don’t need to back up the image itself (in this case ~58MB), just your data, which generally lives in volumes.

Configuration is likely either a handful of files, or just a single one. This will either be super simple to back up, or you’re already storing it in version control - something I also highly recommend.
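Since restoring relies on the image tags in your configuration, it can be worth checking that nothing is untagged or set to latest before you trust a backup. Here is a rough sketch of such a check. The function name find_unpinned_images is my own invention, and it only does simple text matching on image: lines, so treat it as a starting point rather than a proper compose parser.

```shell
# Hypothetical helper: print any image in a compose file that is untagged
# or pinned to "latest". Purely text-based; it does not talk to Docker.
find_unpinned_images() {
    compose_file="$1"
    # Pull out the value of every "image:" key, dropping comments and quotes.
    sed -n 's/^[[:space:]]*image:[[:space:]]*//p' "$compose_file" |
        sed -e 's/[[:space:]]*#.*$//' -e "s/[\"']//g" |
        while read -r image; do
            # Drop any registry/path prefix so a registry port is not
            # mistaken for a tag.
            name="${image##*/}"
            case "$name" in
                *@sha256:*) ;;                 # pinned by digest
                *:latest)   echo "$image" ;;   # explicitly "latest"
                *:*)        ;;                 # has an explicit tag
                *)          echo "$image" ;;   # no tag means "latest"
            esac
        done
}
```

Running find_unpinned_images docker-compose.yml prints one image per line, and prints nothing if everything is pinned.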

#Volumes

Volumes are the way you expose files on the host into a container. There are 2 main types: bind mounts and volume mounts. From the container’s point of view they work much the same, except volume mounts are managed by Docker and live in /var/lib/docker/volumes by default, whereas bind mounts can live wherever you want. Personally I use, and highly recommend, bind mounts.

If you’re using bind mounts, you already know where your data is, so it’s easy to just copy that data off to wherever you’re doing backups. You’ll be able to see the paths nice and easily in your docker-compose.yml under volumes: for each service.

If you’re using volume mounts, the backup process is the same, but the hard part is finding the files. You can list the volume directories used by a container with docker inspect -f '{{ .Mounts }}' <container> (note that this is docker, not docker-compose).
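The raw {{ .Mounts }} output is a bit dense, so here is a sketch that prints one mount per line instead. The function name list_mounts is hypothetical. The template uses the Type, Source and Destination fields of each mount, and Go's built-in println to put each mount on its own line.

```shell
# Hypothetical helper: list each mount of a container as
# "<type> <path on host> <path in container>", one per line.
list_mounts() {
    container="$1"
    docker inspect \
        --format '{{ range .Mounts }}{{ println .Type .Source .Destination }}{{ end }}' \
        "$container"
}
```

The second column is the host path you’d actually copy, whether the mount is a bind mount or a volume mount.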

When backing up the files, the permissions are also important. If you copy your data but don’t get the permissions right, your container may not start. rsync --archive will copy all the metadata for a file along with its content. You may need to run it as root, both so it can read all the files and so it can properly set the ownership. Most other backup mechanisms will also work, but notably plain cp will not, unless you pass --archive (-a).
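As a minimal sketch of that copy, assuming rsync is installed and you run it as root: the function name backup_data and both paths are placeholders of mine, and --numeric-ids is my own addition, since the UIDs used inside containers often don’t map to named users on the host.

```shell
# Hypothetical helper: mirror one data directory into a backup location.
# Run as root so every file is readable and ownership can be set on the copy.
backup_data() {
    src="$1"    # e.g. a bind-mount path from docker-compose.yml
    dest="$2"   # wherever your backups live
    mkdir -p "$dest"
    # --archive keeps permissions, ownership, timestamps and symlinks.
    # --numeric-ids copies raw UID/GID numbers instead of mapping by name.
    # The trailing slash on the source copies its contents, not the
    # directory itself.
    rsync --archive --numeric-ids "${src%/}/" "${dest%/}/"
}
```

It’s usually safest to stop the container first (or use the application’s own dump tool for databases), so you aren’t copying files mid-write.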

#Networking

Chances are, you’re only relying on the bridge network docker-compose creates between containers in the same docker-compose.yml. You don’t need to do anything fancy to back this up - in fact you don’t need to do anything at all! Unless you’ve modified it from outside the compose file, it’ll just come with you and reconfigure itself on next start.

If you’ve created a custom network, then backing that up is slightly harder. Unless you’ve got your own versioning system for this (Ansible or the like), you’ll need to reverse engineer what the network looks like and how to recreate it. Let that be a lesson for not version-controlling it.
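If you do end up reverse engineering a network, docker network inspect is the place to start. This sketch only captures the driver and any subnets, which is an assumption about what matters for your setup. Labels, driver options and the like would need adding. The function name describe_network is hypothetical.

```shell
# Hypothetical helper: print "<driver> <subnet>..." for a custom network,
# so there is a record of how to recreate it.
describe_network() {
    network="$1"
    docker network inspect \
        --format '{{ .Driver }}{{ range .IPAM.Config }} {{ .Subnet }}{{ end }}' \
        "$network"
}
```

Redirecting the output into a file next to your docker-compose.yml means it gets backed up (and ideally version-controlled) along with everything else.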

If the containers you’re trying to back up are connected to others, via external_links or otherwise, you’ll need to back those up too, whether it be a shared database, monitoring services or a reverse proxy.

#Restoring

A backup is only useful when it’s possible to restore it. You can have all the backups in the world, but if you have no idea how to restore them, or worse yet can’t restore them, they’re pretty pointless.

#Configuration

Your configuration files just need to be copied as-is into the correct place. Personally I store mine in /opt/<application>/docker-compose.yml.

With your configuration in place, you can now pull down the required images with docker-compose pull. Assuming you’re tagging things properly, you’ll now have the same image downloaded, without needing to back it up yourself - awesome!
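Put together, restoring the configuration might look something like this sketch. The function name restore_config and its arguments are placeholders, and it assumes a single docker-compose.yml per application.

```shell
# Hypothetical helper: put a backed-up compose file in place and pull
# the images it references.
restore_config() {
    backup_file="$1"   # the backed-up docker-compose.yml
    app_dir="$2"       # e.g. /opt/<application>
    mkdir -p "$app_dir"
    cp "$backup_file" "$app_dir/docker-compose.yml"
    # Run from the application directory so docker-compose finds the file.
    (cd "$app_dir" && docker-compose pull)
}
```

Nothing is started at this point. That comes once the data is back in place.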

#Volumes

When copying your data back into place, as with backing it up, you’ll need to ensure the permissions remain intact. You’ll probably want to use the same tool you used to create the backup in the first place.
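If that tool was rsync, a restore is the same command in the other direction. The sketch below also does a second pass as a dry run, which should print nothing if the restored files and their metadata match the backup. The function name restore_data is hypothetical, and --numeric-ids is my own addition.

```shell
# Hypothetical helper: restore data from a backup, then list anything
# that still differs. Run as root so ownership can be restored.
restore_data() {
    backup="$1"   # where the backup lives
    target="$2"   # the bind-mount path the container expects
    mkdir -p "$target"
    rsync --archive --numeric-ids "${backup%/}/" "${target%/}/"
    # Dry run: report differences without changing anything.
    rsync --archive --numeric-ids --dry-run --itemize-changes \
        "${backup%/}/" "${target%/}/"
}
```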

If you were previously using volume mounts rather than bind mounts, now might be the perfect time to change. It’s quite a pain to put files back into the correct volume mount location, but incredibly simple to just use bind mounts. A small modification to your configuration can make a huge difference to how simple backups become.

#Networking

If you just used the default bridge networks, you’re already set. docker-compose will create the bridge network as usual with all the required containers connected.

If you did have custom networks, now is the time to create them from whatever format you backed them up in, whether that be Ansible playbooks, random bash scripts, or something else.
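For the "random bash scripts" route, recreating a simple network could be sketched like this. It assumes all you recorded was the name, driver and a single subnet. The function name recreate_network and the example values are mine.

```shell
# Hypothetical helper: recreate a custom network unless it already exists.
recreate_network() {
    name="$1"
    driver="$2"
    subnet="$3"
    # Nothing to do if a network with this name is already there.
    if docker network inspect "$name" >/dev/null 2>&1; then
        return 0
    fi
    docker network create --driver "$driver" --subnet "$subnet" "$name"
}
```

For example, recreate_network proxy bridge 172.30.0.0/24 would create a bridge network called proxy, and do nothing on a second run.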

#Closing

In case you weren’t aware, backups are important! Backing up containers is both slightly different, and exactly the same as backing up regular data files. The trick is where to look.
