Determining the Docker registries, namespaces and images you most depend on
How news of the Docker Free Tier being sunset in March 2023 led to organisations wanting to understand their dependence on different namespaces or images on the public Docker Hub.
(Note: adapted from the blog post Working out which Docker namespaces and images you most depend on)
Related read: Case study: Determining how the Docker Free Tier sunset affects you.
Context
Similar to the situation noted in Case study: Determining how the Docker Free Tier sunset affects you, every company I've worked at has at one point wondered "who's using Docker images that aren't internally hosted?"
It can be useful to understand where you use Docker images from external sources - for instance Amazon's public Elastic Container Registry (ECR), or images built by GitHub or GitLab repositories and stored on their respective container registries - as well as from various internal container registries.
Knowing whether there are any namespaces you use heavily can be convenient - for instance, if internal.registry/java is now deprecated and you want to move folks to internal.registry/jvm.
It's also worth checking whether there are Docker images that are heavily depended on, especially if they're not internally managed, as these could be a good opportunity to build an internal alternative.
Additionally, as part of these checks, you can discover if there are uses of non-approved images, for instance if there is an (unenforced) requirement for production services to use internally-hosted-or-proxied images.
Problem
- Which registries are being used for Docker images?
- Which namespaces are being used for Docker images?
- Which Docker images are being used the most?
Data
Let's say that we have the following data in the renovate table:
platform | organisation | repo | package_name | version | current_version | package_manager | package_file_path | datasource | dep_types |
---|---|---|---|---|---|---|---|---|---|
gitlab | technottingham | Hack24-API | mongo | 3.4.3 | 3.4.3 | docker-compose | docker-compose.yml | docker | [] |
gitlab | jamietanna | annadodson | monachus/hugo | | | gitlabci | .gitlab-ci.yml | docker | ["image"] |
github | co-cddo | api-catalogue | ruby | 3.3.0-alpine | 3.3.0-alpine | dockerfile | Dockerfile | docker | ["final"] |
github | elastic | beats | busybox | | | docker-compose | .ci/jobs/docker-compose.yml | docker | [] |
github | incident-io | catalog-importer | alpine | 20230329 | 20230329 | dockerfile | Dockerfile | docker | ["final"] |
github | thechangelog | changelog.com | ghcr.io/thechangelog/changelog-runtime | elixir-v1.14.5-erlang-v26.2-nodejs-v20.10.0 | | docker-compose | .devcontainer/docker-compose.yml | docker | [] |
github | cloud-custodian | cloud-custodian | cloudcustodian/c7n | latest | | helm-values | tools/ops/azure/container-host/chart/values.yaml | docker | [] |
github | hashicorp | consul | docker.mirror.hashicorp.services/alpine | 3.18 | 3.18 | dockerfile | Dockerfile | docker | ["stage"] |
gitlab | jamietanna | content-negotiation | openjdk | 11 | 11 | gitlabci | .gitlab-ci.yml | docker | ["image"] |
gitlab | jamietanna | content-negotiation-go | golang | 1.18 | 1.18 | gitlabci | .gitlab-ci.yml | docker | ["image"] |
gitlab | jamietanna | cucumber-reporting-plugin | openjdk | 11 | 11 | gitlabci | .gitlab-ci.yml | docker | ["image"] |
github | wiremock | wiremock-graphql-extension | maven | 3.6.3-jdk-11-slim | 3.6.3-jdk-11-slim | dockerfile | wiremock-graphql-extension/Dockerfile | docker | ["stage"] |
(Note: this is a subset of the available data)
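If you'd like to experiment with rows like these without a full database to hand, the following is a minimal sketch that loads a few of the sample rows into an in-memory SQLite table and counts the raw package_name values. The table and column names come from the data above, but the column types (all TEXT) are an assumption and may not match the real dmd schema.

```python
import sqlite3

# In-memory stand-in for the `renovate` table. Column names are taken from
# the sample data; the TEXT column types are an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE renovate (
        platform TEXT, organisation TEXT, repo TEXT, package_name TEXT,
        version TEXT, current_version TEXT, package_manager TEXT,
        package_file_path TEXT, datasource TEXT, dep_types TEXT
    )
    """
)

# A handful of the rows shown in the table above.
rows = [
    ("gitlab", "technottingham", "Hack24-API", "mongo", "3.4.3", "3.4.3",
     "docker-compose", "docker-compose.yml", "docker", "[]"),
    ("github", "co-cddo", "api-catalogue", "ruby", "3.3.0-alpine", "3.3.0-alpine",
     "dockerfile", "Dockerfile", "docker", '["final"]'),
    ("github", "incident-io", "catalog-importer", "alpine", "20230329", "20230329",
     "dockerfile", "Dockerfile", "docker", '["final"]'),
    ("github", "hashicorp", "consul", "docker.mirror.hashicorp.services/alpine",
     "3.18", "3.18", "dockerfile", "Dockerfile", "docker", '["stage"]'),
    ("gitlab", "jamietanna", "content-negotiation", "openjdk", "11", "11",
     "gitlabci", ".gitlab-ci.yml", "docker", '["image"]'),
    ("gitlab", "jamietanna", "content-negotiation-go", "golang", "1.18", "1.18",
     "gitlabci", ".gitlab-ci.yml", "docker", '["image"]'),
    ("gitlab", "jamietanna", "cucumber-reporting-plugin", "openjdk", "11", "11",
     "gitlabci", ".gitlab-ci.yml", "docker", '["image"]'),
]
conn.executemany("INSERT INTO renovate VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Count how often each raw package_name appears among Docker dependencies.
counts = conn.execute(
    """
    SELECT package_name, COUNT(*) AS n
    FROM renovate
    WHERE datasource = 'docker'
    GROUP BY package_name
    ORDER BY n DESC, package_name
    """
).fetchall()

for package_name, n in counts:
    print(f"{package_name}: {n}")
```

Grouping on the raw package_name treats alpine and docker.mirror.hashicorp.services/alpine as two unrelated packages, so a query like this answers "which exact strings are most common?" rather than "which registries, namespaces or images are most common?".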
Query
The dmd CLI has an inbuilt query that produces the following output:
$ dmd report mostPopularDockerImages --db dmd.db
Renovate
+----------------------------------+-----+
| REGISTRY | # |
+----------------------------------+-----+
| docker.io | 651 |
| ghcr.io | 24 |
| docker.mirror.hashicorp.services | 20 |
| gcr.io | 19 |
| docker.elastic.co | 17 |
| public.ecr.aws | 12 |
| registry1.dsop.io | 11 |
| mcr.microsoft.com | 8 |
| registry.gitlab.com | 8 |
| registry.access.redhat.com | 3 |
| quay.io | 3 |
| registry1.dso.mil | 2 |
+----------------------------------+-----+
+----------------------------------+-----+
| NAMESPACE | # |
+----------------------------------+-----+
| library | 499 |
| ghcr.io/gravitational | 18 |
| dockersamples | 12 |
| gcr.io/distroless | 11 |
| public.ecr.aws/gravitational | 10 |
| registry1.dsop.io/redhat/ubi | 10 |
| docker.mirror.hashicorp.services | 10 |
| wiremock | 8 |
| docker | 8 |
| docker.elastic.co/elasticsearch | 7 |
| cimg | 6 |
| hashicorpdev | 6 |
+----------------------------------+-----+
+---------+----+
| IMAGE | # |
+---------+----+
| alpine | 57 |
| golang | 53 |
| node | 40 |
| docker | 38 |
| python | 25 |
| nginx | 24 |
| ruby | 23 |
| debian | 22 |
| ubuntu | 22 |
| redis | 20 |
| openjdk | 18 |
| busybox | 14 |
+---------+----+
Note that this isn't straightforward to do with an SQL statement on its own.
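One likely reason is that a package_name has to be split into its registry, namespace and image parts, with Docker Hub's defaults filled in, before it can be counted. The following is a minimal sketch of that idea in Python. It is not how dmd itself implements the report: `ImageRef` and `parse_image` are hypothetical names, and the sketch assumes package_name never includes a tag or digest (the sample data stores these in the version column).

```python
from collections import Counter
from typing import NamedTuple


class ImageRef(NamedTuple):
    registry: str
    namespace: str
    image: str


def parse_image(package_name: str) -> ImageRef:
    """Split an image name into registry, namespace and image (hypothetical helper)."""
    first, sep, rest = package_name.partition("/")
    # Docker's convention: the first path component is only a registry host
    # if it contains a "." or ":" or is exactly "localhost".
    if sep and ("." in first or ":" in first or first == "localhost"):
        registry, path = first, rest
    else:
        registry, path = "docker.io", package_name
    # Everything before the last "/" is the namespace; the rest is the image.
    namespace, sep, image = path.rpartition("/")
    # Un-namespaced images on Docker Hub live in the `library` namespace.
    if not sep and registry == "docker.io":
        namespace = "library"
    return ImageRef(registry, namespace, image)


# package_name values copied from the sample rows in the Data section.
package_names = [
    "mongo",
    "monachus/hugo",
    "ruby",
    "busybox",
    "alpine",
    "ghcr.io/thechangelog/changelog-runtime",
    "cloudcustodian/c7n",
    "docker.mirror.hashicorp.services/alpine",
    "openjdk",
    "golang",
    "openjdk",
    "maven",
]

refs = [parse_image(name) for name in package_names]
registries = Counter(ref.registry for ref in refs)
namespaces = Counter(ref.namespace for ref in refs)
images = Counter(ref.image for ref in refs)

for title, counts in (("REGISTRY", registries), ("NAMESPACE", namespaces), ("IMAGE", images)):
    print(title)
    for value, n in counts.most_common(3):
        print(f"  {value or '(none)'}: {n}")
```

Note that the dmd report above shows namespaces on other registries with the registry as a prefix (for instance ghcr.io/gravitational), which this sketch doesn't replicate: it keeps the registry and namespace as separate fields.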