Importing OpenSSF Security Scorecard data into dependency-management-data

The OpenSSF Security Scorecard makes it possible to understand at-a-glance the health of an Open Source project, and whether there are any concerns with your Supply Chain security.

Since dependency-management-data v0.76.0, it has been possible to enrich the data in the database with data from the public API from Scorecards, via dmd db generate dependency-health, so you can get insight into data alongside other insights that dependency-management-data provides.

However, because the Scorecard project only looks at the 1 million most critical Open Source projects, your dependencies may not all be present.

Therefore, you may be wanting to manually fetch the data from the scorecard CLI to cover all your dependencies' repositories.

Since dependency-management-data v0.84.0, there is now a dmd import scorecard command which allows you to take the results from the scorecard CLI as-is, instead of relying upon the public data that dmd db generate dependency-health will use.

For instance, let's say that we want to process the following repositories for our dependencies:

scorecard --repo github.com/oapi-codegen/runtime --output runtime.json      --format json
scorecard --repo gitlab.com/fdroid/fdroidclient  --output fdroidclient.json --format json

Once these have succeeded, we can then import the data:

# if you've not already prepared the DB
dmd db init --db dmd.db
# NOTE quotes to prevent globbing
dmd db import scorecard --db dmd.db '*.json'

Note that this uses Ecosyste.ms' Packages API to look up dependency names for given repository URLs.

Once this is imported, we can then write queries to look up this data.

For instance we can write the following query to list all packages that have a total scorecard below 3:

select
  s.platform,
  s.organisation,
  s.repo,
  s.package_name,
  s.version,
  s.current_version,
  package_type as package_manager,
  -- as SBOMs don't make this available, default to an empty array
  '[]' as dep_types,
  -- as SBOMs don't make this available, default to an empty string
  '' as package_file_path,
  printf('%.2f', scorecard_score) as scorecard_score
from
  sboms s
  inner join dependency_health as h on s.package_name = h.package_name
  and s.package_type = h.package_manager
where
  scorecard_score < 3
union
select
  r.platform,
  r.organisation,
  r.repo,
  r.package_name,
  r.version,
  r.current_version,
  r.package_manager,
  r.dep_types,
  r.package_file_path,
  printf('%.2f', scorecard_score) as scorecard_score
from
  renovate r
  inner join dependency_health as h on r.package_name = h.package_name
  and r.package_manager = h.package_manager
where
  scorecard_score < 3
order by
  scorecard_score desc;

You can see the result of this query on the example data.