Schemas

Dependency Management Data produces various types of data into the underlying SQLite database.

Below you can find the schemas in the database.

For more details of how to query the database more effectively, check out Understanding the data model.

This schema may not match yours

The schemas shown here are based on the latest changes that have been committed to the project, which won't necessarily match the version in your database. It's worth consulting your own database schema, for instance:


$ sqlite3 dmd.db .schema

internal/advisory/db/schema.sql

-- advisories contains the resulting Advisories
-- (https://dmd.tanna.dev/concepts/advisory/) and Policy Violations
-- (https://dmd.tanna.dev/concepts/policy/) that have been discovered by
-- dependency-management-data when running `dmd db generate advisories` and
-- `dmd db generate policy-violations` respectively.

-- The intent of this table is to make it easier to query all relevant sources
-- of advisories data, without you as the user needing to understand the
-- various sources of data in the utility tables that exist, and perform a long
-- `JOIN`.
--
-- It is still possible to directly query the different tables that feed into
-- this table, but is recommended that for most cases, you start using
-- this `advisories` table.
--
-- `advisories` also makes it possible to include more specific information
-- around the dependency that has the advisory, without requiring the complex
-- `JOIN`s across multiple tables.
--
-- TODO: restore SBOM support
-- https://gitlab.com/tanna.dev/dependency-management-data/-/issues/596
CREATE TABLE IF NOT EXISTS advisories (
  -- what platform hosts the source code that this Advisory was produced for?
  -- i.e. `github`, `gitlab`, `gitea`, etc
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#platform
  --
  -- Foreign keys:
  -- - `renovate.platform`
  --
  -- TODO: restore SBOM support
  -- https://gitlab.com/tanna.dev/dependency-management-data/-/issues/596
  platform TEXT NOT NULL,
  -- what organisation manages the source code that this Advisory was produced
  -- for? Can include `/` for nested organisations
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#organisation
  --
  -- Foreign keys:
  -- - `renovate.organisation`
  --
  -- TODO: restore SBOM support
  -- https://gitlab.com/tanna.dev/dependency-management-data/-/issues/596
  organisation TEXT NOT NULL,
  -- what repo manages the source code that this Advisory was produced
  -- for?
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#repo
  --
  -- Foreign keys:
  -- - `renovate.repo`
  --
  -- TODO: restore SBOM support
  -- https://gitlab.com/tanna.dev/dependency-management-data/-/issues/596
  repo TEXT NOT NULL,

  -- what package is this Advisory for?
  --
  -- Foreign keys:
  -- - `renovate.package_name`
  -- - `sboms.package_name`
  package_name TEXT NOT NULL,
  -- version indicates the version of `package_name` that this Advisory
  -- is for.
  --
  -- NOTE this could be a version constraint, such as any of:
  --
  --   <=1.3.4,>=1.3.0
  --   "~> 0.9"
  --   latest
  --   ^2.0.6
  --   =1.0.4
  --
  -- As well as a specific value, such as:
  --
  --   1.0.4
  --   10
  --   latest
  --
  -- This versioning will be implementation-specific for the `package_manager` in use.
  --
  -- Foreign keys:
  -- - `renovate.version`
  -- - `sboms.version`
  version TEXT NOT NULL,
  -- current_version defines the current version that this package's `version`
  -- resolves to.
  --
  -- If the `version` is an exact version number, such as `1.0.4`, then
  -- `current_version` will usually be the same value, `1.0.4`.
  --
  -- If the `version` is a version constraint, then this column MAY indicate
  -- the exact version that was resolved at the time of dependency analysis.
  --
  -- Foreign keys:
  -- - `renovate.current_version`
  -- - `sboms.current_version`
  current_version TEXT,
  -- package_manager indicates the package manager that the Advisory
  -- will correspond to.
  --
  -- Based on which datasource(s) (https://dmd.tanna.dev/concepts/datasource/)
  -- you are using, this will be a different value:
  --
  -- - for Renovate data, must exactly match `renovate.package_manager`.
  --   Note that there may be multiple `package_managers`, for instance `maven`
  --   and `gradle`, which would require two rows.
  -- - for Software Bill of Materials (SBOM) data, must exactly match `sboms.package_type`
  --
  -- If you are using multiple datasources, you will have one row per
  -- `package_manager` that this Advisory matches.
  --
  -- Foreign keys:
  -- - `renovate.package_manager`
  -- - `sboms.package_type`
  package_manager TEXT NOT NULL,
  -- package_file_path defines the path within `repo` that defines the
  -- `package_name` as a dependency. For example:
  --
  --   .github/workflows/build.yml
  --   go.mod
  --   build/Dockerfile
  --
  -- NOTE that this may be empty
  -- (https://gitlab.com/tanna.dev/dependency-management-data/-/issues/396)
  --
  -- Foreign keys:
  -- - `renovate.package_file_path`
  -- - `sboms` does not have this field
  package_file_path TEXT NOT NULL,
  -- dep_types defines the different dependency types that may be in use. This
  -- will always be a JSON array, with 0 or more string elements. For example:
  --
  --   []
  --   ["action"]
  --   ["dependencies","lockfile"]
  --   ["dependencies","missing-data"]
  --   ["lockfile","lockfile-yarn-pinning-^21.1.1"]
  --   ["engines"]
  --
  -- Based on which datasource(s) (https://dmd.tanna.dev/concepts/datasource/)
  -- you are using, this will have different values and meanings.
  --
  -- TODO Querying this column will be found documented in
  -- https://gitlab.com/tanna.dev/dependency-management-data/-/issues/288
  --
  -- NOTE that in the future these there will be a more consistent naming
  -- structure for these
  -- (https://gitlab.com/tanna.dev/dependency-management-data/-/issues/379)
  --
  -- Foreign keys:
  -- - `renovate.package_file_path`
  -- - `sboms` does not have this field
  dep_types TEXT NOT NULL,

  -- level defines the severity of the Advisory. This will be
  -- organisation-specific in terms of what you deem most critical, but an
  -- example of what this could look like is:
  --
  --   ERROR: "Use of AGPL-3.0 licensed dependencies anywhere is a high-severity"
  --   WARN:  "Using a dependency that hasn't been updated in 1 year should be avoided"
  level TEXT NOT NULL
    CHECK (
      level IN (
        'ERROR',
        'WARN'
      )
    ),
  -- advisory_type defines the type of Advisory
  -- (https://dmd.tanna.dev/concepts/advisory/) that this Advisory will
  -- flagged as.
  advisory_type TEXT NOT NULL
    CHECK (
      advisory_type IN (
        -- the dependency is deprecated, and should ideally be replaced
        'DEPRECATED',
        -- the dependency is no longer maintained
        'UNMAINTAINED',
        -- there is a security issue with this dependency
        'SECURITY',
        -- there is organisational policy that recommends awareness of the use
        -- of this dependency
        'POLICY',
        -- there is no other `advisory_type` that makes sense for this type. If
        -- you feel there should be, please raise an issue on the issue tracker
        -- (https://gitlab.com/tanna.dev/dependency-management-data/-/issues)
        'OTHER'
      )
    ),
  -- description is a human-readable explanation of why this advisory is being
  -- flagged. The contents will be shown verbatim to a user, and will not be
  -- interpreted as markup. This can be as long and detailed as you wish, and
  -- is recommended to include links to (internal) documentation around the
  -- finding, any known remediation actions, and communication channels to
  -- reach out to for information.
  description TEXT NOT NULL,

  -- supported_until describes the date that this dependency is (actively)
  -- supported until
  --
  -- NOTE: that this is only relevant for `UNMAINTAINED` or `DEPRECATED` advisories
  supported_until TEXT,
  -- eol_from describes the date that this dependency will be marked as End of
  -- Life, and will no longer be maintained from
  --
  -- NOTE: that this is only relevant for `UNMAINTAINED` or `DEPRECATED` advisories
  eol_from TEXT,

  UNIQUE (platform, organisation, repo, package_file_path, package_name, package_manager, dep_types, level, advisory_type, description) ON CONFLICT REPLACE
);

-- custom_advisories is a table that makes it possible to flag Advisories
-- (https://dmd.tanna.dev/concepts/advisory/) with packages, for instance to
-- indicate lack of maintainence upstream, security issues, etc.
--
-- These give you the ability to provide additional insight into packages that
-- your projects use.
--
-- This data can be sourced through the `dmd contrib` subcommand, for
-- community-provided advisories, or alternatively you can add your own by
-- directly inserting rows into this table. See
-- https://dmd.tanna.dev/cookbooks/custom-advisories/ for more details on how
-- to do so.
--
-- The custom_advisories table will be joined against any package data
-- (https://dmd.tanna.dev/concepts/datasource/#package-data), which may have
-- different definitions for some columns - please make sure you review each
-- column's documentation.
CREATE TABLE IF NOT EXISTS custom_advisories (
  -- package_pattern defines an exact package name, or a pattern that should
  -- match a package name, indicating which package the advisory is for. this
  -- can either be an exact match, such as:
  --   `dmd.tanna.dev`
  -- or it can include a `*` character to indicate a wildcard such as:
  --   `*/oapi-codegen`
  --   `@my-org/*'
  --   `*tanna*`
  --   `*tan*na*`
  --
  -- Foreign keys:
  -- - `renovate.package_name`
  -- - `sboms.package_name`
  package_pattern TEXT NOT NULL,
  -- package_manager indicates the package manager that the given `package_pattern` should match.
  --
  -- Based on which datasource(s) (https://dmd.tanna.dev/concepts/datasource/)
  -- you are using, this will be a different value:
  -- - for Renovate data, must exactly match `renovate.package_manager`.
  --   Note that there may be multiple `package_managers`, for instance `maven`
  --   and `gradle`, which would require two rows.
  -- - for Software Bill of Materials (SBOM) data, must exactly match
  -- `sboms.package_type`
  --
  -- If you are using multiple datasources, you will need to have one row per `package_manager`.
  --
  -- Foreign keys:
  -- - `renovate.package_manager`
  -- - `sboms.package_type`
  package_manager TEXT NOT NULL,
  -- version defines version(s) that this advisory relates to.
  -- If NULL, any instances of this package (at any version) will be flagged.
  -- If non-NULL, the `version_match_strategy` will be taken into account.
  --
  -- Foreign keys:
  -- - `renovate.current_version`
  -- - `sboms.current_version`
  version TEXT,
  -- version_match_strategy defines how the advisory's `version` column gets
  -- matched against the given dependency.
  -- If NULL, `version_match_strategy` will be treated as if it were set to `ANY`
  -- If non-NULL, will perform the corresponding match type, which are documented below.
  --
  -- NOTE: that this is performed with a lexicographical match, which is NOT
  -- likely to be what you are expecting to perform version constraint matching
  -- For example:
  --   Performing a `GREATER_THAN` v1.10 would result in:
  --     v1.2.3
  --     v1.20.3
  --   Which does NOT match the expectation that you would only see `v1.20.3`.
  --
  -- If you would like more control over advisory data, and to perform true
  -- version number calculations, it's worth writing Policies using Open Policy
  -- Agent (https://dmd.tanna.dev/cookbooks/custom-advisories-opa/)
  --
  -- If `version` is NULL, this column is ignored.
  version_match_strategy TEXT
    CHECK (
      version_match_strategy IN (
        -- any packages that match `package_pattern` and `package_manager` will
        -- be classed as an Advisory
        'ANY',
        -- any packages that match `package_pattern` and `package_manager`, and
        -- has a `current_version` which is exactly equal to `version` will be
        -- classed as an Advisory
        'EQUALS',
        -- any packages that match `package_pattern` and `package_manager`, and
        -- has a `version < current_version` (lexicographically compared) will
        -- be classed as an Advisory
        'LESS_THAN',
        -- any packages that match `package_pattern` and `package_manager`, and
        -- has a `version <= current_version` (lexicographically compared) will
        -- be classed as an Advisory
        'LESS_EQUAL',
        -- any packages that match `package_pattern` and `package_manager`, and
        -- has a `version > current_version` (lexicographically compared) will
        -- be classed as an Advisory
        'GREATER_THAN',
        -- any packages that match `package_pattern` and `package_manager`, and
        -- has a `version >= current_version` (lexicographically compared) will
        -- be classed as an Advisory
        'GREATER_EQUAL'
      )
    ),
  -- level defines the severity of the Advisory. This will be
  -- organisation-specific in terms of what you deem most critical, but an
  -- example of what this could look like is:
  --
  --   ERROR: "Use of AGPL-3.0 licensed dependencies anywhere is a high-severity"
  --   WARN:  "Using a dependency that hasn't been updated in 1 year should be avoided"
  level TEXT NOT NULL
    DEFAULT 'ERROR'
    CHECK (
      level IN (
        'ERROR',
        'WARN'
      )
    ),
  -- advisory_type defines the type of Advisory
  -- (https://dmd.tanna.dev/concepts/advisory/) that this dependency will
  -- flagged as.
  --
  -- NOTE that this field is an exact match for the
  -- `custom_advisories.advisory_type` column
  advisory_type TEXT NOT NULL
    CHECK (
      advisory_type IN (
        -- the dependency is deprecated, and should ideally be replaced
        'DEPRECATED',
        -- the dependency is no longer maintained
        'UNMAINTAINED',
        -- there is a security issue with this dependency
        'SECURITY',
        -- there is organisational policy that recommends awareness of the use
        -- of this dependency
        'POLICY',
        -- there is no other `advisory_type` that makes sense for this type. If
        -- you feel there should be, please raise an issue on the issue tracker
        -- (https://gitlab.com/tanna.dev/dependency-management-data/-/issues)
        'OTHER'
      )
    ),
  -- description is a human-readable explanation of why this advisory is
  -- being flagged. The contents will be shown verbatim to a user, and will
  -- not be interpreted as markup. This can be as long and detailed as you
  -- wish, and is recommended to include links to (internal) documentation
  -- around the finding, any known remediation actions, and communication
  -- channels to reach out to for information.
  description TEXT NOT NULL,

  UNIQUE (package_pattern, package_manager, version, version_match_strategy, advisory_type, level, description) ON CONFLICT REPLACE
);

internal/componentmetadata/db/schema.sql

-- component_metadata allows tracking custom metadata around Components
-- (https://dmd.tanna.dev/concept/component/) in your organisation.
--
-- As Components are not necessarily a one-to-one mapping of a repository, and
-- may not have a Repo Key associated (https://dmd.tanna.dev/concept/repo-key/),
-- we use a separate table to provide metadata for them.
--
-- This is effectively the `repository_metadata` and `owners` table combined,
-- with the ability to better suit monorepo setups.
--
-- This metadata can provide some additional context to your repositories for
-- instance converting:
--
-- > {component_name} has 8 outstanding policy violations, and is running 3
--   end-of-life pieces of software
--
-- But with the metadata inside `component_metadata`, we can instead reframe
-- this with:
--
-- > The public-facing Component {component_name}, which is deployed to highly
--   sensitive PCI environments has 8 outstanding policy violations, and is
--   running 3 end-of-life pieces of software
--
-- This additional metadata can make it possible to better understand and
-- action data within dependency-management-data.
--
-- As well as first-class metadata, it's possible to track custom
-- organisation-specific metadata that can be queried via SQLite's JSON
-- functions.
--
-- The `component_metadata` table is expected to be JOIN'd with
-- `repository_metadata` for any repository specific information, if a Repo Key
-- is known for the Component.
CREATE TABLE IF NOT EXISTS component_metadata (
  -- component_name is the name of the Component that this metadata describes.
  --
  -- Note that `component_name` is expected to be unique.
  --
  -- See also: https://dmd.tanna.dev/concept/component/
  --
  -- Foreign key: `sboms.component_name`
  component_name TEXT NOT NULL,

  -- parent_component_name indicates the `component_name` of the parent
  -- Component to this Component.
  --
  -- This is generally used for the case where there is a monorepo of Components.
  --
  -- For instance, if you had a repository called `customer-help-service` with
  -- a structure such as:
  --
  --   /
  --   /.git/
  --   /apps/
  --   /apps/kafka-consumer/
  --   /apps/web/
  --   /apps/internal-web/
  --
  -- Then this would produce the following Components:
  --
  -- `customer-help-service`                (with a `path` of `/`)
  -- `customer-help-service-kafka-consumer` (with a `path` of `/apps/kafka-consumer/`)
  -- `customer-help-service-web`            (with a `path` of `/apps/web/`)
  -- `customer-help-service-internal-web`   (with a `path` of `/apps/internal-web/`)
  --
  -- Note that this does not currently introduce this as a foreign key (https://gitlab.com/tanna.dev/dependency-management-data/-/issues/106) but is intended to be treated as such
  --
  -- Foreign key: `component_metadata.component_name`
  parent_component_name TEXT,

  -- what platform hosts the source code for this Component? i.e. `github`,
  -- `gitlab`, `gitea`, etc
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#platform
  platform TEXT,
  -- what organisation manages the source code for this Component? Can include
  -- `/` for nested organisations
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#organisation
  organisation TEXT,
  -- what is the repo name for this Component?
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#repo
  repo TEXT,

  -- Where is the Component located within the source repository?
  --
  -- Examples:
  --
  -- NULL / `/`             - the root of the repository is this component
  -- `apps/kafka-processor` - a sub-path within a repository
  path TEXT,

  -- vendor denotes the vendor (i.e. third party) that this Component was
  -- produced by.
  --
  -- This makes sense when the component is a piece of software that
  -- is not developed by your organisation, for instance a product that a
  -- vendor provides you as-is to self-host, or provided as a
  -- Software-As-A-Service.
  --
  -- Must be provided alongside `product`.
  vendor TEXT,

  -- product denotes the piece of software, built by `vendor`, that this
  -- Component corresponds to.
  --
  -- This makes sense when the component is a piece of software that
  -- is not developed by your organisation, for instance a product that a
  -- vendor provides you as-is to self-host, or provided as a
  -- Software-As-A-Service.
  --
  -- Must be provided alongside `vendor`.
  product TEXT,

  -- product_version denotes the version of the Component's `product` this
  -- metadata corresponds to.
  --
  -- This makes sense when the component is a piece of software that
  -- is not developed by your organisation, for instance a product that a
  -- vendor provides you as-is to self-host, or provided as a
  -- Software-As-A-Service.
  --
  -- Must be provided alongside `vendor` and `product`.
  product_version TEXT,

  -- owner is a free-form identifier for who owns the Component. This could be
  -- an email address, a team name, Slack channel name, etc, but should ideally
  -- be clear from this column who should be contacted about any queries about
  -- the Component
  owner TEXT,
  -- notes allows adding additional, optional, context around the ownership,
  -- for instance a link to a Slack channel, Confluence page, internal Service
  -- Catalog, etc. The contents will be shown verbatim to a user, and will not
  -- be interpreted as markup.
  owner_notes TEXT,

  -- component_type is a free-form field to create enum-style data, for
  -- instance `LIBRARY` or `SERVICE`, or `EXAMPLE_CODE`.
  --
  -- This may track with your Developer Portal's own definition of a
  -- Component's type.
  --
  -- For additional usage, you can use the `component_usage` field, or for
  -- further key-value data, use the`additional_metadata` field
  component_type text,

  -- component_usage is a free-form field to note additional information
  -- around the repository's usage, which is organisation-specific.
  --
  -- For instance, this may be enum-style data, a space-separated list of
  -- enum-style data, or a long human-readable description.
  --
  -- For additional usage, you can use the `additional_metadata` field
  component_usage text,

  -- description is a textual description of the repo for more context, which
  -- can include links out to other systems i.e. a Service Catalog. The
  -- contents will be shown verbatim to a user, and will not be interpreted as
  -- markup
  description TEXT,

  -- additional_metadata is a JSON object of additional key-value data that can
  -- be used to provide custom organisation-specific configuration, and augment
  -- any queries for data with information around this additional metadata for
  -- instance:
  --
  -- - `last_commit_date` - the last commit date to the project
  -- - `pci_environment` - the PCI environment the application is deployed to
  -- - `customer_type` - i.e. whether it's used for government, financial
  --   users, etc
  --
  -- NOTE this must be a JSON object of key-value strings
  additional_metadata TEXT,

  CHECK
  (
    (
      platform is null
      and
      organisation is null
      and
      repo is null
    )
    OR
    (
      platform <> ''
      and
      organisation <> ''
      and
      repo <> ''
    )
  ),

  CHECK
  (
    -- ensure that `vendor` and `product` are set
    (
      (
        vendor is null
        and
        product is null
      )
      AND
      (
        vendor <> ''
        and
        product <> ''
      )
    )
    OR
    -- ensure that the `product_version` is only set when `vendor` and `product` are
    (
      (
        vendor is not null
        and
        vendor <> ''
        and
        product is not null
        and
        product <> ''
        and
        product_version <> ''
      )
    )
  ),

  UNIQUE (component_name) ON CONFLICT REPLACE
);

internal/datasources/awselasticache/db/schema.sql

CREATE TABLE IF NOT EXISTS aws_elasticache_datastores (
  account_id TEXT NOT NULL,
  region TEXT NOT NULL,
  arn TEXT NOT NULL,
  name TEXT NOT NULL,
  engine TEXT NOT NULL,
  engine_version TEXT NOT NULL,

  -- tags is a JSON object
  tags TEXT NOT NULL,

  UNIQUE (account_id, region, arn) ON CONFLICT REPLACE
);

CREATE TABLE IF NOT EXISTS aws_elasticache_datastore_engines (
  engine TEXT NOT NULL,
  engine_version TEXT NOT NULL,
  deprecation TEXT NOT NULL,

  UNIQUE (engine, engine_version) ON CONFLICT REPLACE
);

internal/datasources/awslambda/db/schema.sql

CREATE TABLE IF NOT EXISTS aws_lambda_functions (
  account_id TEXT NOT NULL,
  region TEXT NOT NULL,
  arn TEXT NOT NULL,
  name TEXT NOT NULL,
  runtime TEXT NOT NULL,
  last_modified TEXT,

  -- tags is a JSON object
  tags TEXT NOT NULL,

  UNIQUE (account_id, region, arn) ON CONFLICT REPLACE
);

CREATE TABLE IF NOT EXISTS aws_lambda_function_runtimes (
  runtime TEXT NOT NULL,
	-- deprecation marks the "Phase 1" deprecations as noted in https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html:
	-- > Phase 1 - Lambda no longer applies security patches or other updates to
	-- > the runtime. You can no longer **create** functions that use the runtime, but
	-- > you can continue to update existing functions. This includes updating the
	-- > runtime version, and rolling back to the previous runtime version. Note
	-- > that functions that use a deprecated runtime are no longer eligible for
	-- > technical support
	deprecation TEXT,
  -- end_of_life marks the "Phase 2" deprecations as noted in https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html
	-- > Phase 2 - you can no longer **create or update** functions that use the
	-- > runtime. To update a function, you need to migrate it to a supported
	-- > runtime version. After you migrate the function to a supported runtime
	-- > version, you cannot rollback the function to the previous runtime. Phase 2
	-- > starts at least 30 days after the start of Phase 1
  end_of_life TEXT,

  UNIQUE (runtime) ON CONFLICT REPLACE
);

internal/datasources/awsrds/db/schema.sql

CREATE TABLE IF NOT EXISTS aws_rds_databases (
  account_id TEXT NOT NULL,
  region TEXT NOT NULL,
  arn TEXT NOT NULL,
  name TEXT NOT NULL,
  engine TEXT NOT NULL,
  engine_version TEXT NOT NULL,

  -- tags is a JSON object
  tags TEXT NOT NULL,

  UNIQUE (account_id, region, arn) ON CONFLICT REPLACE
);

CREATE TABLE IF NOT EXISTS aws_rds_databases_engines (
  engine TEXT NOT NULL,
  engine_version TEXT NOT NULL,
  deprecation TEXT NOT NULL,

  UNIQUE (engine, engine_version) ON CONFLICT REPLACE
);

internal/datasources/renovate/db/schema.sql

-- renovate contains all dependencies that Renovate has been able to discover
-- about a given repository.
--
-- This may be retrieved through `renovate-graph` or via steps described in
-- https://gitlab.com/tanna.dev/dependency-management-data/-/issues/370
--
-- This is one of the first-class package Datasources
-- (https://dmd.tanna.dev/concepts/datasource/) that dependency-management-data
-- supports.
CREATE TABLE IF NOT EXISTS renovate (
  -- what platform hosts the source code that this Renovate data was produced
  -- for? i.e. `github`, `gitlab`, `gitea`, etc
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#platform
  platform TEXT NOT NULL,
  -- what organisation manages the source code that this Renovate data was
  -- produced for? Can include `/` for nested organisations
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#organisation
  organisation TEXT NOT NULL,
  -- what repo manages the source code that this Renovate data was produced
  -- for?
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#repo
  repo TEXT NOT NULL,

  -- what package is this?
  package_name TEXT NOT NULL,
  -- version indicates the version of this dependency.
  --
  -- NOTE this could be a version constraint, such as any of:
  --
  --   <=1.3.4,>=1.3.0
  --   "~> 0.9"
  --   latest
  --   ^2.0.6
  --   =1.0.4
  --
  -- As well as a specific value, such as:
  --
  --   1.0.4
  --   10
  --   latest
  --   sha256:b57a849e24f2bb2a62eb438768b3f2196b85d987fb26479a215e96c8ec4a82a9
  --   8e5e7e5ab8b370d6c329ec480221332ada57f0ab
  --
  -- This versioning will be implementation-specific for the `package_manager` in use.
  version TEXT NOT NULL,
  -- current_version defines the current version that this package's `version`
  -- resolves to.
  --
  -- If the `version` is an exact version number, such as `1.0.4`, then
  -- `current_version` will usually be the same value, `1.0.4`.
  --
  -- If the `version` is a version constraint, then this column MAY indicate
  -- the exact version that was resolved at the time of dependency analysis.
  --
  -- If the `version` is a digest then this column will be the same value.
  --
  -- The `current_version` is derived from:
  -- - the `locked_version` or `fixed_version`, if there is a lockfile or some
  --   pinning set
  -- - the `current_version` if present, and can be recalculated each time the
  --   data is generated
  current_version TEXT,

  -- digest is the digest that the given version is pinned to, for instance a
  -- Git commit hash or a container image's pinned digest.
  --
  -- Example values:
  --
  --   sha256:b57a849e24f2bb2a62eb438768b3f2196b85d987fb26479a215e96c8ec4a82a9
  --   8e5e7e5ab8b370d6c329ec480221332ada57f0ab
  digest TEXT,

  -- package_manager indicates the package manager that the dependency is from,
  -- which corresponds to Renovate's managers
  -- (https://docs.renovatebot.com/modules/manager/).
  --
  -- For example:
  --
  --   gomod
  --   pip_setup
  --   pip_requirements
  --   maven-wrapper
  --   html
  --   maven
  --   gradle
  package_manager TEXT NOT NULL,
  -- package_file_path defines the path within `repo` that defines the
  -- `package_name` as a dependency. For example:
  --
  --   .github/workflows/build.yml
  --   go.mod
  --   build/Dockerfile
  package_file_path TEXT NOT NULL,

  -- datasource indicates the datasource that the dependency has been sourced
  -- from, which corresponds to Renovate's datasources
  -- (https://docs.renovatebot.com/modules/datasource/).
  --
  -- For example:
  --
  --   go
  --   github-tags
  --   gitlab-tags
  --   maven
  --   gradle-version
  datasource TEXT NOT NULL,
  -- dep_types defines the different dependency types that may be in use. This
  -- will always be a JSON array, with 0 or more string elements. For example:
  --
  --   []
  --   ["action"]
  --   ["dependencies","lockfile"]
  --   ["dependencies","missing-data"]
  --   ["lockfile","lockfile-yarn-pinning-^21.1.1"]
  --   ["engines"]
  --
  -- Based on the Renovate version you use, and which
  -- `package_manager`/`datasource` this is for, these will have differing
  -- levels of data available.
  --
  -- TODO Querying this column will be found documented in
  -- https://gitlab.com/tanna.dev/dependency-management-data/-/issues/288
  --
  -- NOTE that in the future these there will be a more consistent naming
  -- structure for these
  -- (https://gitlab.com/tanna.dev/dependency-management-data/-/issues/379)
  dep_types TEXT NOT NULL,

  -- package_type most commonly relates to the "Type" field of a Package URL
  -- (as defined by https://github.com/package-url/purl-spec), which may be a
  -- package ecosystem or package manager type
  --
  -- NOTE this is derived by dependency-management-data, and will not always be
  -- correct. Please raise an issue if you encounter any `package_type`s that
  -- do not seem correct.
  package_type TEXT NOT NULL,

  -- package_url is the Package URL (as defined by
  -- https://github.com/package-url/purl-spec), for the given dependency
  --
  -- NOTE this is derived by dependency-management-data, and will not always be
  -- correct. Please raise an issue if you encounter any `package_type`s that
  -- do not seem correct.
  package_url TEXT NOT NULL,

  UNIQUE (platform, organisation, repo, package_file_path, package_name, package_manager, dep_types) ON CONFLICT REPLACE
);

-- renovate_updates lists the pending updates that a given dependency has, if
-- Renovate has discovered any.
--
-- A given dependency can have multiple rows, one per `update_type` that may be
-- available for a given package.
CREATE TABLE IF NOT EXISTS renovate_updates (
  -- what platform hosts the source code that this Renovate data was produced
  -- for? i.e. `github`, `gitlab`, `gitea`, etc
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#platform
  platform TEXT NOT NULL,
  -- what organisation manages the source code that this Renovate data was
  -- produced for? Can include `/` for nested organisations
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#organisation
  organisation TEXT NOT NULL,
  -- what repo manages the source code that this Renovate data was produced
  -- for?
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#repo
  repo TEXT NOT NULL,

  package_name TEXT NOT NULL,
  -- this could be a version constraint, such as `<=1.3.4,>=1.3.0`
  version TEXT NOT NULL,
  -- current_version defines the current version that this package's `version`
  -- resolves to.
  --
  -- If the `version` is an exact version number, such as `1.0.4`, then
  -- `current_version` will usually be the same value, `1.0.4`.
  --
  -- If the `version` is a version constraint, then this column MAY indicate
  -- the exact version that was resolved at the time of dependency analysis.
  --
  -- If the `version` is a digest then this column will be the same value.
  --
  -- The `current_version` is derived from:
  -- - the `locked_version` or `fixed_version`, if there is a lockfile or some
  --   pinning set
  -- - the `current_version` if present, and can be recalculated each time the
  --   data is generated
  current_version TEXT,

  -- digest is the digest that the given version is pinned to, for instance a
  -- Git commit hash or a container image's pinned digest.
  --
  -- Example values:
  --
  --   sha256:b57a849e24f2bb2a62eb438768b3f2196b85d987fb26479a215e96c8ec4a82a9
  --   8e5e7e5ab8b370d6c329ec480221332ada57f0ab
  digest TEXT,

  -- package_manager indicates the package manager that the dependency is from,
  -- which corresponds to Renovate's managers (https://docs.renovatebot.com/modules/manager/).
  --
  -- For example:
  --
  --  gomod
  --  pip_setup
  --  pip_requirements
  --  maven-wrapper
  --  html
  --  maven
  --  gradle
  package_manager TEXT NOT NULL,
  -- package_file_path defines the path within `repo` that defines the
  -- `package_name` as a dependency. For example:
  --
  --   .github/workflows/build.yml
  --   go.mod
  --   build/Dockerfile
  package_file_path TEXT NOT NULL,

  -- package_manager indicates the package manager that the dependency is from,
  -- which corresponds to Renovate's datasources (https://docs.renovatebot.com/modules/datasource/).
  --
  -- For example:
  --
  --   go
  --   github-tags
  --   gitlab-tags
  --   maven
  --   gradle-version
  datasource TEXT NOT NULL,

  -- new_version is the exact version that Renovate has determined this package
  -- needs to be updated to, within the `update_type`.
  new_version TEXT NOT NULL,

  -- new_digest is the digest that Renovate has Renovate has determined this package
  -- needs to be updated to, within the `update_type`.
  --
  -- Example values:
  --
  --   sha256:b57a849e24f2bb2a62eb438768b3f2196b85d987fb26479a215e96c8ec4a82a9
  --   8e5e7e5ab8b370d6c329ec480221332ada57f0ab
  new_digest TEXT,
  -- update_type determines the type of update this given change is. For
  -- instance, this could be a `major` upgrade, or it could be migrating from a
  -- deprecated package to a known `replacement`.
  --
  -- Non-exhaustive list of possible values:
  --
  --  digest
  --  pin
  --  rollback
  --  patch
  --  minor
  --  major
  --  replacement
  --  pinDigest
  update_type TEXT NOT NULL,

  UNIQUE (platform, organisation, repo, package_file_path, package_name, package_manager, update_type) ON CONFLICT REPLACE
);

-- create an index to support querying directly on core package metadata
CREATE INDEX idx_renovate_package on renovate(package_name, version, current_version, package_manager);

internal/datasources/sbom/db/schema.sql

-- sboms contains all dependency data that an external tool has detected and
-- exported into a Software Bill of Materials (SBOM).
--
-- This is one of the first-class package Datasources
-- (https://dmd.tanna.dev/concepts/datasource/) that dependency-management-data
-- supports.
CREATE TABLE IF NOT EXISTS sboms (
  -- component_name is the name of the component that this SBOM describes, as
  -- can be determined from the metadata.
  --
  -- For instance:
  --
  -- - in SPDX v2.3 this would be `$.name`
  -- - in CycloneDX v1.4 this would be `$.metadata.component.name`
  --
  -- If no result can be found for the given SBOM, the basename of the filename
  -- will be used, for instance:
  --
  --   filename:        $HOME/Downloads/another-path/sample.spdx.json
  --   component_name:  sample.spdx.json
  --
  -- NOTE that the `component_name` is treated as the unique identifier for a
  -- given SBOM and is used to deduplicate findings.
  component_name TEXT NOT NULL,

  -- what package is this?
  package_name TEXT NOT NULL,
  -- version indicates the version of this dependency.
  --
  -- NOTE this could be a version constraint, such as any of:
  --
  --   <=1.3.4,>=1.3.0
  --   "~> 0.9"
  --   latest
  --   ^2.0.6
  --   =1.0.4
  --                   (NULL)
  --
  -- As well as a specific value, such as:
  --
  --   1.0.4
  --   10
  --   latest
  --
  -- This versioning will be implementation-specific for the `package_manager`
  -- in use.
  --
  -- NOTE that due to the quality of the tool producing the SBOM, this field
  -- may be NULL.
  version TEXT,
  -- current_version defines the current version that this package's `version`
  -- resolves to.
  --
  -- If the `version` is an exact version number, such as `1.0.4`, then
  -- `current_version` will usually be the same value, `1.0.4`.
  --
  -- NOTE that due to the quality of the tool producing the SBOM, this field
  -- may be NULL, or this may not be an exact value, but a version constraint
  -- similar to `version`.
  current_version TEXT,

  -- package_type most commonly relates to the "Type" field of a Package URL
  -- (as defined by https://github.com/package-url/purl-spec), which may be a
  -- package ecosystem or package manager type
  package_type TEXT NOT NULL,

  -- package_url is the Package URL (as defined by
  -- https://github.com/package-url/purl-spec), for the given dependency
  package_url TEXT NOT NULL,

  UNIQUE (component_name, package_name, package_type, package_url) ON CONFLICT REPLACE
);

-- create an index to support querying directly on core package metadata
CREATE INDEX idx_sboms_package on sboms(package_name, version, current_version, package_type);

internal/dependencyhealth/db/schema.sql

-- dependency_health contains information about the health of given
-- dependencies.
--
-- This consumes data from different sources to augment the understanding of
-- dependencies in use, for instance giving an indication of whether they are
-- (well) maintained, have been recently released, or may have supply chain
-- hygiene issues.
--
-- Currently, this data is derived from:
--
-- - OpenSSF Security Scorecards (https://api.securityscorecards.dev/)
-- - Ecosystems (https://ecosyste.ms)
--
-- This data is a best-efforts attempt to provide this insight, and may be
-- stale at the time of fetching, via `dmd db generate dependency-health`.

-- The intent of this table is to provide additional insight into the
-- dependencies in use, either to feed into Policies
-- (https://dmd.tanna.dev/concepts/policy/) or purely for informational
-- purposes.
--
-- This is a utility table that is expected to be JOIN'd with the relevant
-- datasources you are using.
CREATE TABLE IF NOT EXISTS dependency_health (
  -- what package is this dependency health for?
  --
  -- Foreign keys:
  -- - `renovate.package_name`
  -- - `sboms.package_name`
  package_name TEXT NOT NULL,
  -- package_type most commonly relates to the "Type" field of a Package URL
  -- (as defined by https://github.com/package-url/purl-spec), which may be a
  -- package ecosystem or package manager type
  --
  -- This must be an exact match for the `package_type` in the given
  -- Datasource.
  --
  -- Foreign keys:
  -- - `renovate.package_type`
  -- - `sboms.package_type`
  package_type TEXT NOT NULL,

  --------------------------------------------------------------------------
  -------- OpenSSF Scorecards derived data

  -- scorecard_score is the overall calculated value of the OpenSSF Security
  -- Scorecards' checks for the given dependency.
  --
  -- May be NULL, as the repository may not have been scanned.
  scorecard_score REAL
    CHECK (
      scorecard_score >= -1 AND scorecard_score <= 10
    ),

  -- scorecard_codereview is the value of the OpenSSF Security Scorecards'
  -- `Code-Review` check.
  --
  -- More details on the calculation can be found at
  -- https://github.com/ossf/scorecard/blob/main/docs/checks.md#code-review
  --
  -- May be NULL, as the repository may not have been scanned.
  scorecard_codereview INTEGER
    CHECK (
      scorecard_codereview >= -1 AND scorecard_codereview <= 10
    ),

  -- scorecard_maintained is the value of the OpenSSF Security Scorecards'
  -- `Maintained` check.
  --
  -- More details on the calculation can be found at
  -- https://github.com/ossf/scorecard/blob/main/docs/checks.md#maintained
  --
  -- May be NULL, as the repository may not have been scanned.
  scorecard_maintained INTEGER
    CHECK (
      scorecard_maintained >= -1 AND scorecard_maintained <= 10
    ),

  -- scorecard_ciibestpractices is the value of the OpenSSF Security
  -- Scorecards' `CII-Best-Practices` check.
  --
  -- More details on the calculation can be found at
  -- https://github.com/ossf/scorecard/blob/main/docs/checks.md#cii-best-practices
  --
  -- May be NULL, as the repository may not have been scanned.
  scorecard_ciibestpractices INTEGER
    CHECK (
      scorecard_ciibestpractices >= -1 AND scorecard_ciibestpractices <= 10
    ),

  -- scorecard_license is the value of the OpenSSF Security Scorecards'
  -- `License` check.
  --
  -- More details on the calculation can be found at
  -- https://github.com/ossf/scorecard/blob/main/docs/checks.md#license
  --
  -- May be NULL, as the repository may not have been scanned.
  scorecard_license INTEGER
    CHECK (
      scorecard_license >= -1 AND scorecard_license <= 10
    ),

  -- scorecard_dangerousworkflow is the value of the OpenSSF Security
  -- Scorecards' `Dangerous-Workflow` check.
  --
  -- More details on the calculation can be found at
  -- https://github.com/ossf/scorecard/blob/main/docs/checks.md#dangerous-workflow
  --
  -- May be NULL, as the repository may not have been scanned.
  scorecard_dangerousworkflow INTEGER
    CHECK (
      scorecard_dangerousworkflow >= -1 AND scorecard_dangerousworkflow <= 10
    ),

  -- scorecard_packaging is the value of the OpenSSF Security Scorecards'
  -- `Packaging` check.
  --
  -- More details on the calculation can be found at
  -- https://github.com/ossf/scorecard/blob/main/docs/checks.md#packaging
  --
  -- May be NULL, as the repository may not have been scanned.
  scorecard_packaging INTEGER
    CHECK (
      scorecard_packaging >= -1 AND scorecard_packaging <= 10
    ),

  -- scorecard_tokenpermissions is the value of the OpenSSF Security
  -- Scorecards' `Token-Permissions` check.
  --
  -- More details on the calculation can be found at
  -- https://github.com/ossf/scorecard/blob/main/docs/checks.md#token-permissions
  --
  -- May be NULL, as the repository may not have been scanned.
  scorecard_tokenpermissions INTEGER
    CHECK (
      scorecard_tokenpermissions >= -1 AND scorecard_tokenpermissions <= 10
    ),

  -- scorecard_signedreleases is the value of the OpenSSF Security Scorecards'
  -- `Signed-Releases` check.
  --
  -- More details on the calculation can be found at
  -- https://github.com/ossf/scorecard/blob/main/docs/checks.md#signed-releases
  --
  -- May be NULL, as the repository may not have been scanned.
  scorecard_signedreleases INTEGER
    CHECK (
      scorecard_signedreleases >= -1 AND scorecard_signedreleases <= 10
    ),

  -- scorecard_branchprotection is the value of the OpenSSF Security
  -- Scorecards' `Branch-Protection` check.
  --
  -- More details on the calculation can be found at
  -- https://github.com/ossf/scorecard/blob/main/docs/checks.md#branch-protection
  --
  -- May be NULL, as the repository may not have been scanned.
  scorecard_branchprotection INTEGER
    CHECK (
      scorecard_branchprotection >= -1 AND scorecard_branchprotection <= 10
    ),

  -- scorecard_binaryartifacts is the value of the OpenSSF Security Scorecards'
  -- `Binary-Artifacts` check.
  --
  -- More details on the calculation can be found at
  -- https://github.com/ossf/scorecard/blob/main/docs/checks.md#binary-artifacts
  --
  -- May be NULL, as the repository may not have been scanned.
  scorecard_binaryartifacts INTEGER
    CHECK (
      scorecard_binaryartifacts >= -1 AND scorecard_binaryartifacts <= 10
    ),

  -- scorecard_fuzzing is the value of the OpenSSF Security Scorecards'
  -- `Fuzzing` check.
  --
  -- More details on the calculation can be found at
  -- https://github.com/ossf/scorecard/blob/main/docs/checks.md#fuzzing
  --
  -- May be NULL, as the repository may not have been scanned.
  scorecard_fuzzing INTEGER
    CHECK (
      scorecard_fuzzing >= -1 AND scorecard_fuzzing <= 10
    ),

  -- scorecard_securitypolicy is the value of the OpenSSF Security Scorecards'
  -- `Security-Policy` check.
  --
  -- More details on the calculation can be found at
  -- https://github.com/ossf/scorecard/blob/main/docs/checks.md#security-policy
  --
  -- May be NULL, as the repository may not have been scanned.
  scorecard_securitypolicy INTEGER
    CHECK (
      scorecard_securitypolicy >= -1 AND scorecard_securitypolicy <= 10
    ),

  -- scorecard_sast is the value of the OpenSSF Security Scorecards' `SAST`
  -- check.
  --
  -- More details on the calculation can be found at
  -- https://github.com/ossf/scorecard/blob/main/docs/checks.md#sast
  --
  -- May be NULL, as the repository may not have been scanned.
  scorecard_sast INTEGER
    CHECK (
      scorecard_sast >= -1 AND scorecard_sast <= 10
    ),

  -- scorecard_vulnerabilities is the value of the OpenSSF Security Scorecards'
  -- `Vulnerabilities` check.
  --
  -- More details on the calculation can be found at
  -- https://github.com/ossf/scorecard/blob/main/docs/checks.md#vulnerabilities
  --
  -- May be NULL, as the repository may not have been scanned.
  scorecard_vulnerabilities INTEGER
    CHECK (
      scorecard_vulnerabilities >= -1 AND scorecard_vulnerabilities <= 10
    ),

  -- scorecard_pinned-dependencies is the value of the OpenSSF Security
  -- Scorecards' `Pinned-Dependencies` check.
  --
  -- More details on the calculation can be found at
  -- https://github.com/ossf/scorecard/blob/main/docs/checks.md#pinned-dependencies
  --
  -- May be NULL, as the repository may not have been scanned.
  scorecard_pinneddependencies INTEGER
    CHECK (
      scorecard_pinneddependencies >= -1 AND scorecard_pinneddependencies <= 10
    ),

  --------------------------------------------------------------------------
  -------- Ecosyste.ms derived data

  -- ecosystems_repo_archived indicates whether the source code repository that
  -- the package is hosted at has been archived.
  --
  -- Sourced via Ecosystems (https://ecosyste.ms) and corresponds with the
  -- field `$.repo_metadata.archived` in the Packages API.
  --
  -- May be NULL, as the repository may not have been scanned.
  ecosystems_repo_archived BOOLEAN,
  -- ecosystems_repo_pushed_at indicates the last push to the default branch of
  -- the source code repository that manages this package.
  --
  -- Sourced via Ecosystems (https://ecosyste.ms) and corresponds with the
  -- field `$.repo_metadata.pushed_at` in the Packages API.
  --
  -- May be NULL, as the repository may not have been scanned.
  --
  -- Note that this may not always be up-to-date
  -- https://github.com/ecosyste-ms/repos/issues/442#issuecomment-1911818704
  ecosystems_repo_pushed_at TEXT,
  -- ecosystems_repo_updated_at indicates the last activity in the source code
  -- repository that manages this package.
  --
  -- Depending on the host of the source code, this could indicate a comments,
  -- push to a branch, or otherwise.
  --
  -- Sourced via Ecosystems (https://ecosyste.ms) and corresponds with the
  -- field `$.repo_metadata.updated_at` in the Packages API.
  --
  -- May be NULL, as the repository may not have been scanned.
  --
  -- Note that this may not always be up-to-date
  -- https://github.com/ecosyste-ms/repos/issues/442#issuecomment-1911818704
  ecosystems_repo_updated_at TEXT,
  -- ecosystems_repo_last_synced_at indicates the last date that Ecosystems
  -- synced repository data.
  --
  -- This is useful when used in conjunction with decisioning based on other
  -- fields, as it indicates how stale the data may be.
  --
  -- Sourced via Ecosystems (https://ecosyste.ms) and corresponds with the
  -- field `$.repo_metadata.last_synced_at` in the Packages API.
  --
  -- May be NULL, as the repository may not have been scanned.
  --
  -- Note that this may not always be up-to-date
  -- https://github.com/ecosyste-ms/repos/issues/442#issuecomment-1911818704
  ecosystems_repo_last_synced_at TEXT,
  -- ecosystems_last_synced_at indicates the last date that Ecosystems
  -- synced package data.
  --
  -- This is useful when used in conjunction with decisioning based on other
  -- fields, as it indicates how stale the data may be.
  --
  -- Sourced via Ecosystems (https://ecosyste.ms) and corresponds with the
  -- field `$.repo_metadata.last_synced_at` in the Packages API.
  --
  -- May be NULL, as the repository may not have been scanned.
  --
  -- Note that this may not always be up-to-date
  -- https://github.com/ecosyste-ms/repos/issues/442#issuecomment-1911818704
  ecosystems_last_synced_at TEXT,
  -- ecosystems_latest_release_published_at indicates when the last release for
  -- this package was published.
  --
  -- Sourced via Ecosystems (https://ecosyste.ms) and corresponds with the
  -- field `$.latest_release_published_at` in the Packages API.
  --
  -- May be NULL, as the repository may not have been scanned.
  --
  -- Note that this may include retracted releases
  -- https://github.com/ecosyste-ms/packages/issues/619
  ecosystems_latest_release_published_at TEXT,
  -- ecosystems_status indicates the status of the package.
  --
  -- Some possible values:
  --
  --   yanked
  --   deprecated
  --   discontinued
  --
  -- Sourced via Ecosystems (https://ecosyste.ms) and corresponds with the
  -- field `$.status` in the Packages API.
  --
  -- May be NULL, as the repository may not have been scanned.
  ecosystems_status TEXT,
  -- ecosystems_funding indicates the platform(s) that a given package is
  -- onboarded to, for supporters to financially support the project and its
  -- ongoing maintainence.
  --
  -- This is a a key-value JSON object of the different platforms.
  --
  -- Some possible values:
  --
  --   {}
  --
  --   {
  --     "github": "eslint",
  --     "tidelift": "npm/eslint",
  --     "package_metadata": "https://opencollective.com/eslint"
  --   }
  --
  -- Sourced via Ecosystems (https://ecosyste.ms) and corresponds with the following fields from the Packages API:
  --
  -- - the object `$.repo_metadata.metadata.funding`
  -- - the string `$.metadata.funding` (stored in the `package_metadata` field in the resulting object)
  --
  -- May be NULL, as the repository may not have been scanned.
  ecosystems_funding TEXT,

  UNIQUE (package_name, package_type) ON CONFLICT REPLACE
);

internal/depsdev/db/schema.sql

-- depsdev_cves contains Common Vulnerabilities and Exposures (CVE) information
-- as derived from the API of https://deps.dev for given package dependencies.
--
-- This table is a utility table, and is expected to be `JOIN`'d with
-- Datasources (https://dmd.tanna.dev/concepts/datasource/) rather than used on
-- its own.
--
-- TODO: Add package_manager
-- https://gitlab.com/tanna.dev/dependency-management-data/-/issues/397
CREATE TABLE IF NOT EXISTS depsdev_cves (
  -- the package that this CVE describes
  --
  -- Foreign keys:
  -- - `renovate.package_name`
  -- - `sboms.package_name`
  package_name TEXT NOT NULL,
  -- version is the exact version that this CVE describes
  --
  -- Foreign keys:
  -- - `renovate.current_version`
  -- - `sboms.current_version`
  version TEXT NOT NULL,

  -- the external identifier of the CVE
  cve_id TEXT NOT NULL,

  -- updated_at indicates when this CVE was last updated in
  -- dependency-management-data, not when the CVE was last updated
  updated_at TEXT NOT NULL,

  UNIQUE (package_name, version, cve_id) ON CONFLICT REPLACE
);

-- depsdev_cves contains licensing information as derived from the API of
-- https://deps.dev for given package dependencies.
--
-- This table is a utility table, and is expected to be `JOIN`'d with
-- Datasources (https://dmd.tanna.dev/concepts/datasource/) rather than used on
-- its own.
--
-- TODO: Add package_manager
-- https://gitlab.com/tanna.dev/dependency-management-data/-/issues/397
CREATE TABLE IF NOT EXISTS depsdev_licenses (
  -- the package that this license covers
  --
  -- Foreign keys:
  -- - `renovate.package_name`
  -- - `sboms.package_name`
  package_name TEXT NOT NULL,
  -- version is the exact version of this package that this license applies to.
  -- For example:
  --   1.2.3
  --   v5.6.7-somelonghash
  --
  -- Foreign keys:
  -- - `renovate.current_version`
  -- - `sboms.current_version`
  version TEXT NOT NULL,

  -- license is the SPDX Identifier or SPDX License Expression
  -- (https://spdx.dev/learn/handling-license-info/) that declares this
  -- package's license
  license TEXT NOT NULL,

  -- updated_at indicates when this license was last updated in
  -- dependency-management-data, not when the license was last updated
  updated_at TEXT NOT NULL,

  UNIQUE (package_name, version, license) ON CONFLICT REPLACE
);

-- depsdev_deprecated contains indications that a package is deprecated
-- according to signals in the upstream package, such as a `Deprecated:` line in
-- a Go module, or an npm package being marked as deprecated.
--
-- If a package is not found in this table, then it can be understood as not
-- deprecated according to package manager metadata.
--
-- This data is retrieved via https://deps.dev
--
-- This table is a utility table, and is expected to be `JOIN`'d with
-- Datasources (https://dmd.tanna.dev/concepts/datasource/) rather than used on
-- its own.
CREATE TABLE IF NOT EXISTS depsdev_deprecated (
  -- the package that this deprecation covers
  --
  -- Foreign keys:
  -- - `renovate.package_name`
  -- - `sboms.package_name`
  package_name TEXT NOT NULL,
  -- version is the exact version of this package that this deprecation applies
  -- to.
  --
  -- For example:
  --   1.2.3
  --   v5.6.7-somelonghash
  --
  -- Foreign keys:
  -- - `renovate.current_version`
  -- - `sboms.current_version`
  version TEXT NOT NULL,
  -- package_type most commonly relates to the "Type" field of a Package URL
  -- (as defined by https://github.com/package-url/purl-spec), which may be a
  -- package ecosystem or package manager type
  --
  -- Foreign keys:
  -- - `renovate.package_type`
  -- - `sboms.package_type`
  package_type TEXT NOT NULL,

  -- is_deprecated is whether the package is marked as deprecated
  is_deprecated BOOLEAN NOT NULL,

  -- updated_at indicates when this deprecation was last updated in
  -- dependency-management-data, not when the deprecation was last updated
  updated_at TEXT NOT NULL,

  UNIQUE (package_name, package_type, version) ON CONFLICT REPLACE
);

internal/endoflifedate/db/schema.sql

-- endoflifedate_products contains an internal representation of Products and
-- Cycles that https://EndOfLife.date supports.
--
-- This table can be used on its own to perform ad-hoc lookups of end-of-life
-- dates for given dependencies.
--
-- However, this table is primarily a utility table, and is expected to be
-- `JOIN`'d with Datasources (https://dmd.tanna.dev/concepts/datasource/).
--
-- For example, if you are using the SBOMs datasource, you would perform a
-- `JOIN` between the `sboms`, `sboms_endoflife` and `endoflifedate_products`
-- tables.
CREATE TABLE IF NOT EXISTS endoflifedate_products (
  -- product_name describes an EndOfLife.date Product name.
  -- For example:
  --   go
  --   ansible
  --   sqlite
  product_name TEXT NOT NULL,

  -- cycle of the given product, as defined by EndOfLife.date, and the product
  -- itself.
  --
  -- For example (non-exhaustively):
  --   sqlite has cycles: `3`, `2`, `1`
  --   go     has cycles: `1.21`, `1.20`, `1.5`
  cycle TEXT NOT NULL,

  -- supported_until describes the date that this release is (actively)
  -- supported until.
  --
  -- Either `supported_until` or `is_supported` will be specified to indicate
  -- whether the release is (actively) supported.
  supported_until TEXT,
  -- is_supported indicates whether the this release is (actively)
  -- supported until.
  --
  -- You'll likely want to query i.e. `WHERE is_supported = false`
  --
  -- Either `supported_until` or `is_supported` will be specified to indicate
  -- whether the release is (actively) supported.
  is_supported BOOLEAN,
  -- eol_from describes the date that this release will be marked as End of
  -- Life, and will no longer be maintained from
  --
  -- Either `eol_from` or `is_eol ` will be specified to indicate whether the
  -- release is End of Life, and will no longer be maintained from
  eol_from TEXT,
  -- is_eol indicates whether this release is marked as End of Life, and will
  -- no longer be maintained
  --
  -- You'll likely want to query i.e. `WHERE is_eol = true`
  --
  -- Either `eol_from` or `is_eol ` will be specified to indicate whether the
  -- release is End of Life, and will no longer be maintained from
  is_eol BOOLEAN,

  -- inserted_at indicates when this row was inserted into the database
  inserted_at TEXT NOT NULL,

  UNIQUE (product_name, cycle) ON CONFLICT REPLACE,
  CHECK(product_name <> ''),
  CHECK(cycle <> '')
);


-- renovate_endoflife contains a mapping between packages found in the
-- `renovate` table, and `endoflifedate_products`.
--
-- This table is a utility table, and is expected to be `JOIN`d to other
-- tables. See `endoflifedate_products` for more details.
CREATE TABLE IF NOT EXISTS renovate_endoflife (
  -- what package is this?
  --
  -- Foreign key: `renovate.package_name`
  package_name TEXT NOT NULL,
  -- version indicates the version of this dependency.
  --
  -- NOTE this could be a version constraint, such as any of:
  --
  --   <=1.3.4,>=1.3.0
  --   "~> 0.9"
  --   latest
  --   ^2.0.6
  --   =1.0.4
  --
  -- As well as a specific value, such as:
  --
  --   1.0.4
  --   10
  --   latest
  --
  -- This versioning will be implementation-specific for the `package_manager`
  -- in use.
  --
  -- Foreign key: `renovate.version`
  version TEXT NOT NULL,
  -- current_version defines the current version that this package's `version`
  -- resolves to.
  --
  -- If the `version` is an exact version number, such as `1.0.4`, then
  -- `current_version` will usually be the same value, `1.0.4`.
  --
  -- If the `version` is a version constraint, then this column MAY indicate
  -- the exact version that was resolved at the time of dependency analysis.
  --
  -- The `current_version` is derived from:
  -- - the `locked_version` or `fixed_version`, if there is a lockfile or some
  --   pinning set
  -- - the `current_version` if present, and can be recalculated each time the
  --   data is generated
  --
  -- Foreign key: `renovate.current_version`
  current_version TEXT,

  -- package_manager indicates the package manager that the dependency is from,
  -- which corresponds to Renovate's managers
  -- (https://docs.renovatebot.com/modules/manager/).
  --
  -- For example:
  --
  --  gomod
  --  pip_setup
  --  pip_requirements
  --  maven-wrapper
  --  html
  --  maven
  --  gradle
  --
  -- Foreign key: `renovate.package_manager`
  package_manager TEXT NOT NULL,

  -- datasource indicates the datasource that the dependency has been sourced
  -- from, which corresponds to Renovate's datasources
  -- (https://docs.renovatebot.com/modules/datasource/).
  --
  -- For example:
  --
  --   go
  --   github-tags
  --   gitlab-tags
  --   maven
  --   gradle-version
  --
  -- Foreign key: `renovate.datasource`
  datasource TEXT NOT NULL,

  -- product_name describes an EndOfLife.date Product name.
  -- For example:
  --   go
  --   ansible
  --   sqlite
  --
  -- Foreign key: `endoflifedate_products.product_name`
  product_name TEXT NOT NULL,
  -- cycle of the given product, as defined by EndOfLife.date, and the product
  -- itself.
  --
  -- For example (non-exhaustively):
  --   sqlite has cycles: `3`, `2`, `1`
  --   go     has cycles: `1.21`, `1.20`, `1.5`
  --
  -- Foreign key: `endoflifedate_products.cycle`
  cycle TEXT NOT NULL,

  UNIQUE (package_name, version, current_version, package_manager, datasource) ON CONFLICT REPLACE
);

-- sboms_endoflife contains a mapping between packages found in the `sboms`
-- table, and `endoflifedate_products`.
--
-- This table is a utility table, and is expected to be `JOIN`d to other
-- tables. See `endoflifedate_products` for more details.
CREATE TABLE IF NOT EXISTS sboms_endoflife (
  -- what package is this?
  --
  -- Foreign key: `sboms.package_name`
  package_name TEXT NOT NULL,
  -- version indicates the version of this dependency.
  --
  -- NOTE this could be a version constraint, such as any of:
  --
  --   <=1.3.4,>=1.3.0
  --   "~> 0.9"
  --   latest
  --   ^2.0.6
  --   =1.0.4
  --                   (NULL)
  --
  -- As well as a specific value, such as:
  --
  --   1.0.4
  --   10
  --   latest
  --
  -- This versioning will be implementation-specific for the `package_manager`
  -- in use.
  --
  -- NOTE that due to the quality of the tool producing the SBOM, this field
  -- may be NULL.
  --
  -- Foreign key: `sboms.version`
  version TEXT,
  -- current_version defines the current version that this package's `version`
  -- resolves to.
  --
  -- If the `version` is an exact version number, such as `1.0.4`, then
  -- `current_version` will usually be the same value, `1.0.4`.
  --
  -- NOTE that due to the quality of the tool producing the SBOM, this field
  -- may be NULL, or this may not be an exact value, but a version constraint
  -- similar to `version`.
  --
  -- Foreign key: `sboms.current_version`
  current_version TEXT,

  -- package_type most commonly relates to the "Type" field of a Package URL
  -- (as defined by https://github.com/package-url/purl-spec), which may be a
  -- package ecosystem or package manager type
  --
  -- Foreign key: `sboms.package_type`
  package_type TEXT NOT NULL,

  -- product_name describes an EndOfLife.date Product name.
  -- For example:
  --   go
  --   ansible
  --   sqlite
  --
  -- Foreign key: `endoflifedate_products.product_name`
  product_name TEXT NOT NULL,
  -- cycle of the given product, as defined by EndOfLife.date, and the product
  -- itself.
  --
  -- For example (non-exhaustively):
  --   sqlite has cycles: `3`, `2`, `1`
  --   go     has cycles: `1.21`, `1.20`, `1.5`
  --
  -- Foreign key: `endoflifedate_products.cycle`
  cycle TEXT NOT NULL,


  UNIQUE (package_name, version, package_type) ON CONFLICT REPLACE
);

internal/externaldata/db/schema.sql

-- external_licenses defines a table that can consume user-generated licensing
-- information for packages.
--
-- The prescence of external_licenses will take priority over
-- `depsdev_licenses` in reporting, as the assumption is that this
-- user-generated data is of higher quality.

-- This table will only be populated by DMD if:
--
-- - an SBOM defines a license
--
-- Otherwise, this table should be populated by your own tooling.
CREATE TABLE IF NOT EXISTS external_licenses (
  -- the package that this license covers
  --
  -- Foreign keys:
  -- - `renovate.package_name`
  -- - `sboms.package_name`
  package_name TEXT NOT NULL,
  -- version is the exact version of this package that this license applies to.
  -- For example:
  --   1.2.3
  --   v5.6.7-somelonghash
  --
  -- Foreign keys:
  -- - `renovate.current_version`
  -- - `sboms.current_version`
  version TEXT NOT NULL,
  -- package_manager indicates the package manager that the given
  -- `package_pattern` should match.
  --
  -- Based on which datasource(s) (https://dmd.tanna.dev/concepts/datasource/)
  -- you are using, this will be a different value:
  -- - for Renovate data, must exactly match `renovate.package_manager`.
  --   Note that there may be multiple `package_managers`, for instance `maven`
  --   and `gradle`, which would require two rows.
  -- - for Software Bill of Materials (SBOM) data, must exactly match `sboms.package_type`
  --
  -- If you are using multiple datasources, you will need to have one row per
  -- `package_manager`.
  --
  -- Foreign keys:
  -- - `renovate.package_manager`
  -- - `sboms.package_type`
  package_manager TEXT NOT NULL,

  -- license is the SPDX Identifier or SPDX License Expression
  -- (https://spdx.dev/learn/handling-license-info/) that declares this
  -- package's license
  license TEXT NOT NULL,

  UNIQUE (package_name, version, package_manager, license) ON CONFLICT REPLACE,
  CHECK(package_name <> ''),
  CHECK(version <> ''),
  CHECK(package_manager <> ''),
  CHECK(license <> '')
);

internal/libyear/db/schema.sql

-- libyears contains calculations for the Libyear metric across each version of
-- dependency in use.
--
-- This uses the Libyear metric, which is defined as "how many years between
-- the version we're currently using and the latest version released", and then
-- totalled across all libraries used by the project.
--
-- Repositories with a Libyear value of 0 could indicate that either all
-- dependencies are up-to-date, or that the Libyear could not be calculated and
-- are not shown.
--
-- NOTE: that this may not include all dependencies, so the number could be
-- higher than shown.
--
-- The intent of this table is to give an indication of dependnecy freshness
-- with an at-a-glance indicator, and can be used to provide a straightforward
-- metric to quantify how behind the latest updates a project is.
--
-- Note that a low number could still be problematic, for instance if a package
-- has pushed out a dozen releases in the last month, all of which include
-- significant breaking changes.
--
-- Alternatively, a repository with a low score could be simply because the
-- dependencies in use are no longer maintained.
--
-- It is worth using this metric with context where possible, or at least with
-- the above caveats in mind.
--
-- This data is a best-efforts attempt to provide this insight, and may be
-- stale at the time of fetching, via `dmd db generate libyear`, as it relies
-- upon the date at which Ecosystems retrieved the data.
--
-- Further reading:
--
--  https://chaoss.community/kb/metric-libyears/
--  https://libyear.com/
--
-- This data is sourced via Ecosystems (https://ecosyste.ms).
--
-- This is a utility table that is expected to be JOIN'd with the relevant
-- datasources you are using.
CREATE TABLE IF NOT EXISTS libyears (
  -- the package that this Libyear is calculated for.
  --
  -- Foreign keys:
  -- - `renovate.package_name`
  -- - `sboms.package_name`
  package_name TEXT NOT NULL,
  -- version indicates the version of `package_name` that this Libyear is
  -- calculated for.
  --
  -- NOTE this could be a version constraint, such as any of:
  --
  --   <=1.3.4,>=1.3.0
  --   "~> 0.9"
  --   latest
  --   ^2.0.6
  --   =1.0.4
  --
  -- As well as a specific value, such as:
  --
  --   1.0.4
  --   10
  --   latest
  --
  -- This versioning will be implementation-specific for the `package_manager` in use.
  --
  -- Foreign keys:
  -- - `renovate.version`
  -- - `sboms.version`
  version TEXT NOT NULL,
  -- current_version indicates the version of `package_name` that this Libyear
  -- is calculated for.
  --
  -- If the `version` is an exact version number, such as `1.0.4`, then
  -- `current_version` will usually be the same value, `1.0.4`.
  --
  -- If the `version` is a version constraint, then this column MAY indicate
  -- the exact version that was resolved at the time of dependency analysis.
  --
  -- Foreign keys:
  -- - `renovate.current_version`
  -- - `sboms.current_version`
  current_version TEXT,
  -- package_manager indicates the package manager that this package
  -- corresponds to.
  --
  -- Based on which datasource(s) (https://dmd.tanna.dev/concepts/datasource/)
  -- you are using, this will be a different value:
  --
  -- - for Renovate data, must exactly match `renovate.package_manager`.
  --   Note that there may be multiple `package_managers`, for instance `maven`
  --   and `gradle`, which would require two rows.
  -- - for Software Bill of Materials (SBOM) data, must exactly match `sboms.package_type`
  --
  -- If you are using multiple datasources, you will have one row per
  -- `package_manager` that this Advisory matches.
  --
  -- Foreign keys:
  -- - `renovate.package_manager`
  -- - `sboms.package_type`
  package_manager TEXT NOT NULL,

  -- libyear is the value of the Libyear metric, which is defined as "how many
  -- years between the version we're currently using and the latest version
  -- released", and then totalled across all libraries used by the project.
  --
  -- Dependencies with a Libyear value of 0 could indicate that either all
  -- dependencies are up-to-date, or that the Libyear could not be calculated and
  -- are not shown.
  --
  -- Note that a low number could still be problematic, for instance if a package
  -- has pushed out a dozen releases in the last month, all of which include
  -- significant breaking changes.
  --
  -- Alternatively, a package may have a low score simply because the package is
  -- no longer maintained, so no new releases means you are "up to date".
  --
  -- It is worth using this metric with context where possible, or at least with
  -- the above caveats in mind.
  --
  -- It is recommended to display this at a precision of 2 decimal places.
  --
  -- Further reading:
  --
  --  https://chaoss.community/kb/metric-libyears/
  --  https://libyear.com/
  libyear REAL NOT NULL
    CHECK (
      libyear >= 0
    ),

  -- version_release_date is the release datetime that this `version` / `current_version` was released at
  version_release_date TEXT NOT NULL,

  -- latest_version is the latest release of this package, as known by
  -- Ecosystems (https://ecosyste.ms).
  --
  -- Sourced via Ecosystems (https://ecosyste.ms) and corresponds with the
  -- field `$.latest_release_number` in the Packages API.
  --
  -- Note that this may not always be up-to-date
  -- https://github.com/ecosyste-ms/repos/issues/442#issuecomment-1911818704
  latest_version TEXT NOT NULL,

  -- latest_version_release_date is the latest release of this package, as
  -- known by Ecosystems (https://ecosyste.ms).
  --
  -- Sourced via Ecosystems (https://ecosyste.ms) and corresponds with the
  -- field `$.latest_release_published_at` in the Packages API.
  --
  -- Note that this may not always be up-to-date
  -- https://github.com/ecosyste-ms/repos/issues/442#issuecomment-1911818704
  latest_version_release_date TEXT NOT NULL,

  -- last_synced_at indicates the last date that Ecosystems
  -- synced package data.
  --
  -- This is useful when used in conjunction with decisioning based on other
  -- fields, as it indicates how stale the data may be.
  --
  -- Sourced via Ecosystems (https://ecosyste.ms) and corresponds with the
  -- field `$.repo_metadata.last_synced_at` in the Packages API.
  --
  -- May be NULL, as the repository may not have been scanned.
  --
  -- Note that this may not always be up-to-date
  -- https://github.com/ecosyste-ms/repos/issues/442#issuecomment-1911818704
  last_synced_at TEXT NOT NULL,

  UNIQUE (package_name, package_manager, version) ON CONFLICT REPLACE
);

internal/metadata/db/schema.sql

-- metadata contains internal metadata about the dependency-management-data
-- database.
--
-- This provides a common location for key-value information to be stored that
-- affects the whole database.
--
-- Known keys:
--
--  dmd_version - the version of the `dmd` CLI that was used to create the
--  underlying database
--
--  finalised_at - the date at which the database was "finalised", and should
--  now be treated as "up-to-date" and read-only from this point onwards. This
--  indicates that all datasources were imported and all enrichment (i.e. via
--  Advisories, Dependency Health, etc) was complete. This could indicate
--  that all datasources' data is now up-to-date, but there are likely some
--  that haven't been as recently scanned.
--
--  compatible_since - the earliest version of the official tooling, such as
--  `dmd`, `dmd-web` and `dmd-graph` that can be used with the given database.
--  For instance:
--  - if set to `v0.10.0` and you're running `dmd v0.10.0`, this should be compatible
--  - if set to `v0.10.3` and you're running `dmd v0.10.0`, you are likely to
--    encounter issues that are not compatible, and should align the versions -
--    for instance, this may be due to a bug that's been fixed that you may be
--    rely upon the behaviour of
--  - if set to `v0.13.0` and you're running `dmd v0.10.0`, you are likely to
--    encounter issues that are not compatible, and should align the versions -
--    for instance, this may be that a breaking change has been made to the
--    table schema
--  For more details, see https://dmd.tanna.dev/concepts/compatible-since/
--
-- Consuming applications MAY add their own data to this table, but must note
-- that there are no guarantees about the data not being overwritten by
-- dependency-management-data's own requirements in the future.
CREATE TABLE IF NOT EXISTS metadata (
  name TEXT NOT NULL,
  value TEXT NOT NULL,

  UNIQUE (name) ON CONFLICT REPLACE
);

internal/ownership/db/schema.sql

-- owners allows defining which team / person / vendor / part of your
-- organisation / etc owns a given repository. This data is expected to be
-- sourced through a Service Catalog or through some other means, and will be
-- organisation-specific.
--
-- More detail can be found in https://dmd.tanna.dev/cookbooks/ownership/
-- around how to source + query this data.
--
-- Each platform/organisation/repo combination should have one owner.
--
-- NOTE that in the future, multiple owners may be allowed
-- https://gitlab.com/tanna.dev/dependency-management-data/-/issues/112
--
-- TODO: restore SBOM support
-- https://gitlab.com/tanna.dev/dependency-management-data/-/issues/596
CREATE TABLE IF NOT EXISTS owners (
  -- what platform hosts the source code that this repo is for? i.e. `github`,
  -- `gitlab`, `gitea`, etc
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#platform
  --
  -- Foreign keys:
  -- - `renovate.platform`
  platform TEXT NOT NULL,
  -- what organisation manages the source code for this repo? Can include `/`
  -- for nested organisations
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#organisation
  --
  -- Foreign keys:
  -- - `renovate.organisation`
  organisation TEXT NOT NULL,
  -- what is the repo name?
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#repo
  --
  -- Foreign keys:
  -- - `renovate.repo`
  repo TEXT NOT NULL,

  -- owner is a free-form identifier for who owns the repository. This could be
  -- an email address, a team name, Slack channel name, etc, but should ideally
  -- be clear from this column who should be contacted about any queries about
  -- the repository
  owner TEXT NOT NULL,
  -- notes allows adding additional, optional, context around the ownership,
  -- for instance a link to a Slack channel, Confluence page, internal Service
  -- Catalog, etc. The contents will be shown verbatim to a user, and will not
  -- be interpreted as markup.
  notes TEXT,

  -- updated_at indicates when this ownership information was last updated in
  -- dependency-management-data, not when the ownership last changed between
  -- teams
  updated_at TEXT NOT NULL,

  UNIQUE (platform, organisation, repo) ON CONFLICT REPLACE
);

internal/packagemetadata/db/schema.sql

-- package_metadata contains metadata about a given
-- dependency in use.
--
-- This data is sourced via Ecosystems (https://ecosyste.ms).
--
-- This is a utility table that is expected to be JOIN'd with the relevant
-- datasources you are using.
CREATE TABLE IF NOT EXISTS package_metadata (
  -- what package is this metadata for?
  --
  -- Foreign keys:
  -- - `renovate.package_name`
  -- - `sboms.package_name`
  package_name TEXT NOT NULL,
  -- package_type most commonly relates to the "Type" field of a Package URL
  -- (as defined by https://github.com/package-url/purl-spec), which may be a
  -- package ecosystem or package manager type
  --
  -- This must be an exact match for the `package_type` in the given
  -- Datasource.
  --
  -- Foreign keys:
  -- - `renovate.package_type`
  -- - `sboms.package_type`
  package_type TEXT NOT NULL,

  -- repository_url indicates the discovered URL for the source repository this
  -- dependency lives at.
  --
  -- Sourced via Ecosystems (https://ecosyste.ms) and corresponds with the
  -- following fields from the Packages API:
  --
  -- - the string `$.repository_url`
  repository_url TEXT NOT NULL,

  UNIQUE (package_name, package_type) ON CONFLICT REPLACE
);

internal/repositorymetadata/db/schema.sql

-- repository_metadata allows tracking custom metadata around repositories,
-- separate to the `owners` table.
--
-- This metadata can provide some additional context to your repositories for
-- instance converting:
--
-- > {repo} has 8 outstanding policy violations, and is running 3 end-of-life
--   pieces of software
--
-- But with the metadata inside `repository_metadata`, we can instead reframe
-- this with:
--
-- > The public-facing repo {repo}, which is deployed to highly sensitive PCI
--   environments has 8 outstanding policy violations, and is running 3
--   end-of-life pieces of software
--
-- This additional metadata can make it possible to better understand and
-- action data within dependency-management-data.
--
-- As well as first-class metadata, it's possible to track custom
-- organisation-specific metadata that can be queried via SQLite's JSON
-- functions.
CREATE TABLE IF NOT EXISTS repository_metadata (
  -- what platform hosts the source code that this repo is for? i.e. `github`,
  -- `gitlab`, `gitea`, etc
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#platform
  --
  -- Foreign keys:
  -- - `renovate.platform`
  -- - `component_metadata.platform`
  platform TEXT NOT NULL,
  -- what organisation manages the source code for this repo? Can include `/`
  -- for nested organisations
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#organisation
  --
  -- Foreign keys:
  -- - `renovate.organisation`
  -- - `component_metadata.organisation`
  organisation TEXT NOT NULL,
  -- what is the repo name?
  --
  -- See also: https://dmd.tanna.dev/concepts/repo-key/#repo
  --
  -- Foreign keys:
  -- - `renovate.repo`
  -- - `component_metadata.repo`
  repo TEXT NOT NULL,

  -- is_monorepo indicates whether the repository is treated as a monorepo
  is_monorepo boolean not null,

  -- is_fork indicates whether this is a forked repository. This could indicate
  -- that this is a temporary repository, a long-standing fork for security +
  -- supply-chain hygiene purposes, or some other reason.
  is_fork boolean not null,

  -- repository_type is a free-form field to create enum-style data, for
  -- instance `LIBRARY` or `SERVICE`, or `EXAMPLE_CODE`.
  --
  -- This may track with your Developer Portal's own definition of a
  -- repository's type.
  --
  -- For additional usage, you can use the `repository_usage` field, or for
  -- further key-value data, use the`additional_metadata` field
  repository_type text not null,

  -- repository_usage is a free-form field to note additional information
  -- around the repository's usage, which is organisation-specific.
  --
  -- For instance, this may be enum-style data, a space-separated list of
  -- enum-style data, or a long human-readable description.
  --
  -- For additional usage, you can use the `additional_metadata` field
  repository_usage text,

  -- visibility indicates the repository's visibility in the source forge
  --
  -- NOTE that this may be straightforward if you're using a publicly hosted
  -- source forge, but if you're running on an internally run, i.e. VPN'd off
  -- source force, this field may have a slightly different interpretation
  visibility TEXT NOT NULL
    CHECK (
      visibility IN (
        'PUBLIC',
        'PRIVATE',
        'INTERNAL'
      )
    ),

  -- description is a textual description of the repo for more context, which
  -- can include links out to other systems i.e. a Service Catalog. The
  -- contents will be shown verbatim to a user, and will not be interpreted as
  -- markup
  description TEXT,

  -- additional_metadata is a JSON object of additional key-value data that can
  -- be used to provide custom organisation-specific configuration, and augment
  -- any queries for data with information around this additional metadata for
  -- instance:
  --
  -- - `last_commit_date` - the last commit date to the project
  -- - `pci_environment` - the PCI environment the application is deployed to
  -- - `customer_type` - i.e. whether it's used for government, financial
  --   users, etc
  --
  -- NOTE this must be a JSON object of key-value strings
  additional_metadata TEXT,

  UNIQUE (platform, organisation, repo) ON CONFLICT REPLACE
);

internal/sensitivepackages/db/schema.sql

-- Sensitive Packages are a way to mark a package as 'sensitive', such that the
-- package's details should not be processed through any external systems,
-- which limits the risk of sensitive metadata exposure for your organisation.
-- More details about usage can be found in
-- https://dmd.tanna.dev/cookbooks/avoiding-sensitive-package-names/
CREATE TABLE IF NOT EXISTS sensitive_packages (
  -- package_pattern defines an exact package name, or a pattern that should
  -- match a package name, indicating which package(s) are or are not
  -- sensitive. this can either be an exact match, such as:
  --   `dmd.tanna.dev`
  -- or it can include a `*` character to indicate a wildcard such as:
  --   `*/oapi-codegen`
  --   `@my-org/*'
  --   `*tanna*`
  --   `*tan*na*`
  package_pattern TEXT NOT NULL,
  -- package_manager indicates the package manager that the given
  -- `package_pattern` should match.
  --
  -- Based on which datasource(s) (https://dmd.tanna.dev/concepts/datasource/)
  -- you are using, this will be a different value:
  -- - for Renovate data, must exactly match `renovate.package_manager`.
  --   Note that there may be multiple `package_managers`, for instance `maven`
  --   and `gradle`, which would require two rows.
  -- - for Software Bill of Materials (SBOM) data, must exactly match `sboms.package_type`
  --
  -- If you are using multiple datasources, you will need to have one row per
  -- `package_manager`.
  --
  -- Foreign keys:
  -- - `renovate.package_manager`
  -- - `sboms.package_type`
  package_manager TEXT,
  -- match_strategy defines how to mark the package's sensitivity.
  -- If `MATCHES`, any packages that match `package_pattern` and `package_manager` (if set) will be marked as sensitive
  -- If `DOES_NOT_MATCH`, any packages that do not match `package_pattern` and `package_manager` (if set) will be marked as sensitive
  match_strategy TEXT NOT NULL
    CHECK (
      match_strategy IN (
        'MATCHES',
        'DOES_NOT_MATCH'
      )
    ),

  UNIQUE (package_pattern, package_manager, match_strategy)
);