Setting up the Git repo to store dependency-management-data output
Once you've decided how to collect your data, you then have to decide how you're going to store it.
The recommended location for storing dependency-management-data's data is a Git repo, which lets you connect your CI platform of choice to the data and run builds on every push to the repo, on a schedule (for example hourly), or on some other trigger.
It is recommended to call the repository dependency-management-data.
Repo layout
A structure that works quite well is:
- have a README that explains what the repository is, gives some basic instructions on how to get started, and links out to further documentation, if appropriate
  - this is useful not only for folks landing at the repo, but also because it can be bundled with the SQLite database, so a consumer has a copy of the data and docs for how to use it
- create a top-level directory for each datasource you're using
- an advisories.sql file, or similarly named, to define your own custom advisories
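Bundling the README with the database can be as simple as creating an archive that contains both files. The following is a minimal sketch, assuming the database has already been built as dmd.db in the repository root; the `bundle_database` function name and the default archive name are illustrative and not part of dependency-management-data.

```shell
#!/bin/sh
# Sketch: package the built database together with the README.
# `bundle_database` and the default archive name are illustrative.

bundle_database() {
  archive="${1:-dmd-bundle.tar.gz}"

  # Fail early with a clear message if either file is missing.
  for f in dmd.db README.md; do
    [ -f "$f" ] || { echo "missing $f" >&2; return 1; }
  done

  # Create a gzip-compressed archive holding the data and its docs.
  tar -czf "$archive" dmd.db README.md
}
```

Calling `bundle_database` with no arguments writes dmd-bundle.tar.gz to the current directory; pass a path as the first argument to write it elsewhere.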
Although (likely) not part of the repository, it's also worthwhile creating a wiki page, for instance in your source control platform of choice, which can contain common queries, and more in-depth getting started guides.
On disk, that could look like the following:
renovate/
  github-jamietanna-jamietanna.json
  ...
sboms/
  snyk-oapi-codegen-cyclone.json
  ...
README.md
advisories.sql
An example of this structure can be found in the example repo on GitLab.com.
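To bootstrap a new repository with this layout, something like the following could be used. This is a minimal sketch: the renovate/ and sboms/ directory names are taken from the example structure, and the README text is only a placeholder.

```shell
#!/bin/sh
# Scaffold the suggested layout in the current directory.
# Directory names follow the example structure; adjust them to your datasources.

# One top-level directory per datasource.
mkdir -p renovate sboms

# Create the README and the custom advisories file only if they are missing,
# so re-running the script never overwrites real content.
[ -f README.md ] || printf '# dependency-management-data\n' > README.md
[ -f advisories.sql ] || : > advisories.sql
```

Git does not track empty directories, so the datasource directories only show up in the repo once they contain at least one export.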
Building the database
The setup required to take the dependency data exports and produce the SQLite database will differ depending on which CI platform you're using.
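Whichever platform you use, the job runs the same sequence of dmd commands, so it can help to try them locally first. The following is a sketch that only uses the commands shown in the CI configurations on this page; it assumes `dmd` and `sqlite3` are on your PATH, and the `build_database` function name is illustrative.

```shell
#!/bin/sh
# Sketch: run the same build steps locally that the CI jobs run.
# Assumes `dmd` and `sqlite3` are on PATH; `build_database` is an illustrative name.

build_database() {
  db="${1:-dmd.db}"

  dmd db init --db "$db" || return 1
  # Import each datasource you collect; only Renovate exports are shown here.
  dmd import renovate --db "$db" 'renovate/*' --no-progress || return 1
  dmd contrib download || return 1
  dmd db generate missing-data --db "$db" || return 1
  dmd db generate advisories --db "$db" || return 1
  # Custom advisories are optional here, so skip the step when the file is absent.
  if [ -f advisories.sql ]; then
    sqlite3 "$db" < advisories.sql || return 1
  fi
  dmd db meta finalise --db "$db" || return 1
}
```

Run `build_database` from the root of the repository; it stops at the first failing step, which mirrors how a CI job would fail.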
GitHub Actions
To build the database from a set of dependencies, we can create a workflow at, for example, .github/workflows/build-database.yml:
name: Build Dependency Management Data database
on:
  # alternatively, on a schedule
  push:
    branches:
      - main
    paths:
      - .github/workflows/build-database.yml
      - advisories.sql
      # alternatively, other datasources
      - renovate/*
  # allow manual builds
  workflow_dispatch: {}
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Go
        uses: actions/setup-go@v4
        with:
          go-version: 'stable'
      - name: Install DMD CLI
        run: go install dmd.tanna.dev/cmd/dmd@latest
      - name: Initialise database
        run: dmd db init --db dmd.db
      - name: Import dependencies
        run: |
          dmd import renovate --db dmd.db 'renovate/*' --no-progress
          # and any others
      - name: Fetch -contrib data
        run: dmd contrib download
      - name: Generate missing-data
        run: dmd db generate missing-data --db dmd.db
      - name: Generate advisories
        run: dmd db generate advisories --db dmd.db
      - name: Add custom organisation-specific advisories
        run: sqlite3 dmd.db < advisories.sql
      - name: Finalise the database
        run: dmd db meta finalise --db dmd.db
      - name: Upload artifact
        uses: actions/upload-artifact@v3
        with:
          name: sqlite-db
          path: |
            dmd.db
            README.md
          # as appropriate, but can be rebuilt as-and-when you need it
          retention-days: 1
It's also recommended to add, for example, .github/workflows/test-advisories.yml to validate that custom advisories work and are applied to the right dependencies:
name: Validate advisories.sql
on:
  push:
    paths:
      - .github/workflows/test-advisories.yml
      - advisories.sql
  workflow_dispatch: {}
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Go
        uses: actions/setup-go@v4
        with:
          go-version: 'stable'
      - name: Install DMD CLI
        run: go install dmd.tanna.dev/cmd/dmd@latest
      - name: Initialise database
        run: dmd db init --db dmd.db
      - name: Import dependencies
        run: |
          dmd import renovate --db dmd.db 'renovate/*' --no-progress
          # and any others
      - name: Add custom organisation-specific advisories
        run: sqlite3 dmd.db < advisories.sql
      - name: List advisories
        run: dmd report advisories --db dmd.db
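The same validation can be run before pushing. The following is a sketch that mirrors the workflow's steps against a scratch database; it assumes `dmd` and `sqlite3` are on your PATH, the `check_advisories` function name is illustrative, and the `-bail` option to `sqlite3` is an addition that the workflow itself does not use.

```shell
#!/bin/sh
# Sketch: check advisories.sql against a scratch database before pushing.
# Assumes `dmd` and `sqlite3` are on PATH; `check_advisories` is an illustrative name.

check_advisories() {
  # Use a throwaway directory so an existing dmd.db is never modified.
  scratch="$(mktemp -d)" || return 1
  db="$scratch/dmd.db"

  # -bail is an addition to the workflow's command: it makes sqlite3 stop
  # at the first SQL error instead of carrying on with later statements.
  dmd db init --db "$db" &&
    dmd import renovate --db "$db" 'renovate/*' --no-progress &&
    sqlite3 -bail "$db" < advisories.sql &&
    dmd report advisories --db "$db"
  status=$?

  # Clean up the scratch database whether or not the checks passed.
  rm -rf "$scratch"
  return "$status"
}
```

Run `check_advisories` from the root of the repository and review the listed advisories to confirm they apply to the dependencies you expect.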
GitLab CI
On GitLab CI, we would have the following .gitlab-ci.yml:
build-database:
  image: golang:1.21-alpine
  stage: build
  before_script:
    - go install dmd.tanna.dev/cmd/dmd@latest
  script:
    - dmd db init --db dmd.db
    - dmd import renovate --db dmd.db 'renovate/*.json'
    - dmd contrib download
    - dmd --db dmd.db db generate missing-data
    - dmd --db dmd.db db generate advisories
  artifacts:
    when: on_success
    paths:
      - dmd.db
    expire_in: "1 day"
  rules:
    - changes:
        # or other datasources
        - renovate/*
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
An example of this can be found in the example repo on GitLab.com.
Collecting the data
This is left as an exercise for the reader, as it depends heavily on the tools you're using to collect the data.
GitLab CI
An example of doing this using renovate-graph with GitLab CI can be found in the example repo on GitLab.com.