Companion code: github.com/opscart/docker-security-practical-guide (tag v1.12.0)
Key Takeaways
- Hardened container images — Docker Hardened Images (DHI), Chainguard, and self-built minimal bases — are necessary but not sufficient for regulated production. The security outcomes most teams want emerge from a surrounding trust control plane, not from the image itself.
- The trust control plane is a three-layer architectural pattern — Supply Chain, Trust, Enforcement — joined by a feedback loop. The layers are roles, not products.
- The pattern is vendor-neutral. A substitution test in the companion lab shows the same Kyverno policies and audits working unchanged across DHI, Chainguard, and a self-built example signed against a project-owned identity. The verification mechanism is invariant; only the trust root changes.
- A recurring failure mode in production container deployments is governance gaps rather than image-level vulnerabilities: unsigned images get pulled, drift goes undetected, and admission policies sit in advisory mode rather than enforcing.
- The pattern carries real operational costs: no shell for live debugging, migration is more than a FROM line change, and signature paths vary by vendor in ways that surprise integrators.
- The pattern is overkill for small teams. It earns its keep where regulatory pressure, fleet size, or blast radius justifies the investment.
1. Introduction: The Hardened Image Promise
Over the past eighteen months, “hardened” or “secure-by-default” container images have moved from a security-engineering curiosity to a recognized category. Docker launched Docker Hardened Images (DHI) in May 2025; the program expanded in June; and in December, Docker open-sourced major portions of the build tooling under Apache 2.0. Chainguard Images, distroless variants from Google, and self-built minimal images all sit in the same conceptual neighborhood: ship the application binary, its runtime dependencies, and almost nothing else.
The pitch is intuitive. A typical Debian or Ubuntu base image carries hundreds of packages, most of which the application never executes. Each package is a potential CVE. Strip them out, and the attack surface shrinks proportionally. Vendors publish impressive comparison charts: dozens of high-severity CVEs in a stock Node.js image, near-zero in the hardened equivalent. Pharma and finance security teams, under perpetual audit pressure, find this difficult to argue with.
The problem is that the comparison is incomplete, not wrong. CVE count is a property of an image. Production security is a property of a system. Between the moment an image is built and the moment a workload it powers is processing traffic, half a dozen control points decide whether the hardening actually matters. Most of those control points have nothing to do with the image itself.
This article situates hardened images within the broader supply-chain security ecosystem — SLSA, Sigstore, and the CNCF supply-chain security working group — and argues that the value of hardening compounds only when wired into an architectural pattern that includes admission control and continuous drift detection. I worked on a migration of a regulated production workload onto a hardened-image baseline this year; Lab 12 of my docker-security-practical-guide repository is a sanitized, fully reproducible distillation of what that work taught me. The short version: the value is in the control plane around the image, not the image itself.
The intended audience is architects and senior engineers making image-strategy decisions for regulated workloads.
2. The Real Problem: Governance, Not Image Surface
Walk into any post-incident review for a container-related production breach. The proximate cause is almost never a CVE in the base image. A recurring set of failure modes shows up instead:
- A team pushes a debug build with a different base image to production, and admission control doesn’t block it because the policy is in Audit mode rather than Enforce.
- A long-running deployment keeps a six-month-old image digest while the team patches the new builds. Drift detection doesn’t exist.
- The platform team rotates signing keys. Pipelines that signed with the old key keep producing images that admission policy still accepts because the policy was written with a broad identity match. Nobody notices for ninety days.
- A vendor pushes an updated base image with the same tag. CI rebuilds against the new digest. The new digest is unsigned because the signing pipeline lives elsewhere. Production takes it. No alert fires.
None of these failures are CVE failures. They are governance failures — gaps in how images are produced, attested, verified, and continuously monitored across their lifecycle in the cluster. Swapping the base image to a hardened variant changes none of them. A signed-and-attested hardened image that lands in a cluster which doesn’t verify signatures is operationally equivalent to a signed Ubuntu image in that same cluster: the signature is decorative.
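The first failure mode above, a policy left in Audit mode, is straightforward to detect mechanically. What follows is a minimal sketch rather than part of the companion lab: it assumes the policies have already been loaded into Python dictionaries (for example from `kubectl get clusterpolicy -o json`) and that the failure action lives in the spec-level `validationFailureAction` field. The function name and the sample policy names are hypothetical.

```python
from typing import Any


def non_enforcing_policies(policies: list[dict[str, Any]]) -> list[str]:
    """Return the names of policies whose failure action is not Enforce."""
    flagged = []
    for policy in policies:
        name = policy.get("metadata", {}).get("name", "<unnamed>")
        # Assumption: a missing field is treated as Audit, i.e. advisory only.
        action = policy.get("spec", {}).get("validationFailureAction", "Audit")
        if str(action).lower() != "enforce":
            flagged.append(name)
    return flagged


# Hypothetical sample data, not taken from the lab.
SAMPLE_POLICIES = [
    {"metadata": {"name": "require-signed-images"},
     "spec": {"validationFailureAction": "Enforce"}},
    {"metadata": {"name": "example-advisory-policy"},
     "spec": {"validationFailureAction": "Audit"}},
    {"metadata": {"name": "example-policy-without-action"}, "spec": {}},
]

print(non_enforcing_policies(SAMPLE_POLICIES))
```

A check like this could run in CI against the policy repository, so that a policy slipping back into advisory mode becomes a failed build rather than a finding in a post-incident review.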
The conceptual move is to stop thinking of “image security” as a property of the image and start thinking of it as a property of the pipeline that produces, ships, admits, and monitors the image. That pipeline — the set of policies, attestations, verifications, and audits that together establish what’s allowed to run — is what I’m calling the trust control plane. The companion lab’s experiments (E1 and E3 in particular) produce concrete audit-log evidence of each of the failure modes above being caught by the pattern, and missed without it.
When the trust control plane is healthy, the image is just one input. When it’s absent, the image is the only line of defense, and CVE count becomes a proxy metric for something it can’t measure. The remainder of this article specifies the pattern.
3. The Trust Control Plane: A Three-Layer Pattern

Figure 1 — The architecture separates supply chain generation, admission-time trust verification, and continuous runtime enforcement into independent layers connected through a feedback loop. The pattern is vendor-agnostic: any compatible signing, admission, and drift-detection components can fulfill these roles.
The architecture has three layers:
Supply Chain layer — concerned with what was built and by whom. At minimum: the image is signed (cosign keyless against Fulcio is the current default), a Software Bill of Materials (SBOM) is produced and attached as an attestation, and a SLSA provenance attestation describes the build environment. The output is an image whose origin and contents are independently verifiable. DHI and Chainguard ship this layer pre-populated for their published images; self-built bases require you to run the layer yourself, typically in a GitHub Action or equivalent CI workflow.
Trust layer — concerned with what we’re willing to admit. This is the in-cluster verification gate. Kyverno’s verifyImages rule is the current pragmatic choice: a ClusterPolicy rejects pods whose images don’t carry a signature from an approved identity. The policy is the unit of governance — it encodes which signers, which attestations, and which constraints are required for a workload to start.
Enforcement layer — concerned with what continues to be true. Admission is point-in-time. Production is continuous. The enforcement layer answers questions admission can’t: “Is this pod still running the digest we admitted?”, “Has the signing key been revoked since admission?”, “Has new unsigned work landed via a controller that bypasses admission?”.
The feedback loop — concerned with learning. Findings from the enforcement layer should flow back into the supply chain layer: a drift finding produces a new signed digest; an admission rejection produces a ticket against the team whose pipeline produced the rejected artifact; a revoked key propagates into the policy’s identity matcher. Without the loop, the enforcement layer becomes an alerting backwater that engineers mute.
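One way to picture the feedback loop is as a routing table from enforcement findings to upstream actions. The sketch below is illustrative only: the `Finding` type, the finding kinds, and the action strings are assumptions that paraphrase the three examples in the paragraph above, not an interface from the lab.

```python
from dataclasses import dataclass
from enum import Enum


class FindingKind(Enum):
    DIGEST_DRIFT = "digest_drift"
    ADMISSION_REJECTED = "admission_rejected"
    KEY_REVOKED = "key_revoked"


@dataclass(frozen=True)
class Finding:
    kind: FindingKind
    workload: str
    owner_team: str


# Each enforcement finding maps to an action in an upstream layer.
FEEDBACK_ACTIONS = {
    FindingKind.DIGEST_DRIFT: "rebuild and sign a new digest",
    FindingKind.ADMISSION_REJECTED: "open a ticket against the producing pipeline",
    FindingKind.KEY_REVOKED: "update the policy's identity matcher",
}


def route(finding: Finding) -> str:
    """Turn a finding into an owned, actionable message."""
    action = FEEDBACK_ACTIONS[finding.kind]
    return f"[{finding.owner_team}] {finding.workload}: {action}"


# Hypothetical workload and team names.
print(route(Finding(FindingKind.DIGEST_DRIFT, "payments/api", "team-payments")))
```

The design point is that every finding kind has exactly one upstream action and one owner; a finding with no entry in the table raises a `KeyError` instead of being silently dropped.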
Before moving to the substitution test, a structured view of what each layer adds:
| Capability | Hardened Image Only | With Trust Control Plane |
|---|---|---|
| Reduced base CVE count | Yes | Yes |
| Admission enforcement | No | Yes |
| Drift detection | No | Yes |
| Signature lifecycle handling | No | Yes |
| Audit readiness | Partial | Strong |
| Vendor portability | No | Yes |
| Operational complexity | Low | Moderate–High |
The bottom row is deliberate: this is not a free upgrade.
4. The Substitution Test: Vendor-Neutrality as a Design Property

Figure 2 — The substitution test. Swap the image vendor and only the supply-chain layer’s identity matcher changes. The Trust and Enforcement layers are unchanged. The pattern, not the vendor, is the asset.
A useful test for whether you’ve found an architectural pattern, as opposed to a vendor-specific recipe, is the substitution test: can you swap a major component out and have the rest of the architecture continue to work with no structural changes?
For the trust control plane, the test is: swap the hardened-image vendor. The lab demonstrates three substitutions:
Configuration A: Docker Hardened Images. Images sourced from DHI. Kyverno verifyImages rule configured with Docker’s signing identity and Fulcio’s OIDC issuer. SBOM attestations consumed from the DHI-attached attestations.
Configuration B: Chainguard Images. The Kyverno policy is identical in shape. The subject and issuer fields in the identity matcher change to Chainguard’s. The image pull strings change. Everything else — policy structure, SBOM verification, drift audit, alerting — is unchanged.
Configuration C: Self-built, project-signed example. Uses an alpine:3.19 base signed via cosign keyless against the GitHub Actions OIDC identity. The Kyverno policy’s identity matcher references that issuer and the project’s GitHub organization. Alpine is intentionally not distroless — the test demonstrates trust-root invariance independently of base image surface area.
An important nuance: The verification mechanism is identical across all three — cosign against an OIDC-issued certificate. What shifts is the trust root: vendor identity in A and B, project-owned GitHub Actions identity in C. The mechanism is invariant; the root authority is not.
In all three configurations, the same analyze-drift.py runs unchanged. The same Kyverno admission flow runs unchanged. The same SBOM consumption runs unchanged. Edits are confined to the identity matcher and the image references themselves.
This is what I mean by saying the pattern is the asset. A team that has invested in the trust control plane has built portable institutional capability. A team that has invested in “we use DHI” has bought a product, and a future migration off DHI — for cost, for relationship, for regulatory geography — is a structural rewrite rather than a configuration update.
The substitution test also clarifies an evaluation question I get asked frequently: “Should we standardize on DHI or Chainguard?” The architectural answer is that the choice matters less than the question implies, provided the surrounding control plane is in place. The decision criteria become commercial — pricing, support, image-catalog coverage, regulatory geography — rather than architectural.
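The substitution test can be expressed as an executable check: render the policy for two trust roots, blank out the fields that are allowed to vary, and confirm that nothing else differs. The sketch below assumes a policy shaped like the lab's require-signed-images policy. The self-built values come from this article; the vendor values are deliberate placeholders, not Docker's or Chainguard's real signing identities, and the helper names are hypothetical.

```python
import copy
from dataclasses import dataclass


@dataclass(frozen=True)
class TrustRoot:
    image_glob: str
    subject: str
    issuer: str


def build_policy(root: TrustRoot) -> dict:
    """Render the signed-image policy for one trust root."""
    return {
        "apiVersion": "kyverno.io/v1",
        "kind": "ClusterPolicy",
        "metadata": {"name": "require-signed-images"},
        "spec": {
            "validationFailureAction": "Enforce",
            "rules": [{
                "name": "verify-cosign-keyless",
                "match": {"any": [{"resources": {"kinds": ["Pod"]}}]},
                "verifyImages": [{
                    "imageReferences": [root.image_glob],
                    "attestors": [{"entries": [{"keyless": {
                        "subject": root.subject,
                        "issuer": root.issuer,
                    }}]}],
                    "required": True,
                }],
            }],
        },
    }


def without_trust_root(policy: dict) -> dict:
    """Blank out the only fields that may vary between image sources."""
    stripped = copy.deepcopy(policy)
    verify = stripped["spec"]["rules"][0]["verifyImages"][0]
    verify["imageReferences"] = []
    verify["attestors"][0]["entries"][0]["keyless"] = {}
    return stripped


SELF_BUILT = TrustRoot(
    image_glob="ghcr.io/opscart/*",
    subject="https://github.com/opscart/*",
    issuer="https://token.actions.githubusercontent.com",
)
# Placeholder values only; substitute the vendor's documented identity.
PLACEHOLDER_VENDOR = TrustRoot(
    image_glob="registry.example/vendor/*",
    subject="https://signer.vendor.example/*",
    issuer="https://issuer.vendor.example",
)

same_shape = (without_trust_root(build_policy(SELF_BUILT))
              == without_trust_root(build_policy(PLACEHOLDER_VENDOR)))
print("policy shape invariant across trust roots:", same_shape)
```

If a vendor swap ever forces a change outside the blanked fields, the comparison fails, which is a useful early signal that a vendor-specific recipe has crept into the policy.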
5. Supply Chain Layer — Provenance and Signing
The Supply Chain layer’s job is to produce an image whose origin and contents are independently verifiable. This is where the broader supply-chain security ecosystem lives: the SLSA framework defines a maturity ladder for build-system integrity (Levels 1–4 in the original v0.1 specification, Build Levels 0–3 in v1.0); Sigstore provides the keyless signing primitives (cosign, Fulcio, Rekor) that have become the de-facto standard; and the CNCF Security Technical Advisory Group maintains the Software Supply Chain Best Practices guide that ties these together. The pattern in this article is one concrete instantiation of those references; teams building their own should treat SLSA and Sigstore as the authoritative specifications.
In the lab, three artifacts are co-produced for every image:
The image itself, pushed to the container registry by digest.
A cosign signature, produced via keyless signing against a Fulcio-issued certificate tied to an OIDC identity. For self-built images, the OIDC identity is the GitHub Actions workflow identity. For DHI and Chainguard images, the identity is the vendor’s signing pipeline.
A vendor note: DHI signatures resolve via Docker Scout’s registry infrastructure, not the image’s own registry path. Kyverno handles this through the policy’s repository field; custom verification tooling needs explicit configuration.
An SBOM attestation, generated with syft from the built image and attached via cosign attest as an in-toto attestation. The SBOM lists every package and version present in the image. In the trust layer, Kyverno can require SBOM presence; in the enforcement layer, the SBOM is what the vulnerability audit reads when it correlates running images against known CVEs.
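A SLSA provenance attestation is the next layer of maturity. SLSA expects provenance from Build Level 1 upward; Level 3 additionally requires that it be generated by a hardened build platform that the build itself cannot tamper with. The lab includes a SLSA-provenance variant for teams that need it; the Trust layer’s policy can require it just as readily as the SBOM.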
The lab’s pipeline implements a full build → push → sign → attest → verify flow that fails closed if verification breaks. The complete workflow and run history is public.
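The fail-closed property is the part of that flow most worth getting right. A minimal sketch of the idea, under the assumption that each stage can be wrapped as a callable returning a boolean, is shown below. This is not the lab's workflow definition: the runner, the simulated steps, and the trailing "promote" stage are all hypothetical.

```python
from typing import Callable

Step = tuple[str, Callable[[], bool]]


def run_fail_closed(steps: list[Step]) -> tuple[bool, list[str]]:
    """Run steps in order and stop at the first failure or exception."""
    completed: list[str] = []
    for name, action in steps:
        try:
            ok = action()
        except Exception:
            ok = False  # an error counts as a failure, never as a pass
        if not ok:
            return False, completed
        completed.append(name)
    return True, completed


# Simulated stages; real ones would invoke the build and signing tools.
PIPELINE: list[Step] = [
    ("build", lambda: True),
    ("push", lambda: True),
    ("sign", lambda: True),
    ("attest", lambda: True),
    ("verify", lambda: False),   # simulated verification failure
    ("promote", lambda: True),   # hypothetical later stage; must not run
]

print(run_fail_closed(PIPELINE))
```

Treating exceptions as failures matters: a verification step that crashes should block the release exactly as one that returns a negative result would.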
Verification, end-to-end, looks like this on the command line:
Verification confirms the cosign claims, the existence of the transparency-log entry, and a signature match against the specified identity (see Figure 3).
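As a hedged illustration of the kind of invocation involved, the sketch below assembles a keyless `cosign verify` command from its three inputs: the image reference, the expected certificate identity, and the OIDC issuer. It is not the lab's exact command. The image name and the identity pattern are hypothetical and the digest is left as a placeholder; the `--certificate-identity-regexp` and `--certificate-oidc-issuer` flags are those of cosign 2.x.

```python
import shlex


def cosign_verify_argv(image_ref: str, identity_regexp: str, issuer: str) -> list[str]:
    """Build the argument vector for a keyless cosign verification."""
    return [
        "cosign", "verify",
        "--certificate-identity-regexp", identity_regexp,
        "--certificate-oidc-issuer", issuer,
        image_ref,
    ]


# Hypothetical image name; replace <digest> with the real image digest.
IMAGE = "ghcr.io/opscart/example-image@sha256:<digest>"
ARGV = cosign_verify_argv(
    IMAGE,
    identity_regexp="^https://github.com/opscart/.*",
    issuer="https://token.actions.githubusercontent.com",
)

# Print the command for review; pass ARGV to subprocess.run to execute it.
print(shlex.join(ARGV))
```

Building the command as a list rather than a string keeps shell quoting out of the picture and makes the identity and issuer explicit parameters, which are the two values a vendor substitution would change.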
The combination — image, signature, SBOM, optionally provenance — is what the registry holds. Each artifact is content-addressed and tied back to the image digest. The Supply Chain layer’s contract with the rest of the system is: here is an image, and here is verifiable evidence about how it came to be. The Trust layer consumes that evidence.

Figure 3 — A successful cosign verify against the lab’s keyless-signed sample image. The same machinery, with vendor-specific identities, runs inside Kyverno at admission time.
6. Trust Layer — Verification at Admission
The Trust layer’s job is to admit only workloads whose evidence satisfies policy. In Kubernetes today, the practical mechanism is an admission controller. In the lab, that controller is Kyverno, configured with a ClusterPolicy whose verifyImages rule asserts that every image carries a cosign signature from an allow-listed identity. A trimmed extract from the lab’s require-signed-images policy:

    apiVersion: kyverno.io/v1
    kind: ClusterPolicy
    metadata:
      name: require-signed-images
    spec:
      validationFailureAction: Enforce
      rules:
        - name: verify-cosign-keyless
          match:
            any:
              - resources:
                  kinds: [Pod]
          verifyImages:
            - imageReferences: ["ghcr.io/opscart/*"]
              attestors:
                - entries:
                    - keyless:
                        subject: "https://github.com/opscart/*"
                        issuer: "https://token.actions.githubusercontent.com"
              required: true

The subject and issuer together define the identity matcher; the wildcard in subject is narrow enough to limit the trust root to the project’s GitHub organization. For DHI, the same rule’s matcher would point at Docker’s signing identity and Fulcio issuer; for Chainguard, Chainguard’s. The shape of the policy is invariant across configurations.
A note on alternatives. OPA/Gatekeeper can perform image verification, but the path is less direct: Gatekeeper’s constraint framework is general-purpose, so signature verification typically requires external attestation providers or the Ratify project as an integration layer. Kyverno’s verifyImages is a first-class primitive, which is why this lab uses it.
A second policy in the lab, require-sbom-attestation, is composed alongside the first: it asserts that an in-toto SBOM attestation is present and signed by the same identity. Together they form the admission contract: signed, by an approved identity, with an SBOM attestation. A pod failing either check is rejected by the admission webhook before reaching kubelet.
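The admission contract described above reduces to three checks applied in order. The sketch below models that decision as a pure function so the logic can be read in one place. It is an illustration of the contract, not of Kyverno's implementation: the `ImageEvidence` type is hypothetical, the workflow path in the sample subject is made up, and Python's `fnmatchcase` stands in for Kyverno's own wildcard matching, which may differ in detail.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional

ALLOWED_SUBJECT = "https://github.com/opscart/*"
ALLOWED_ISSUER = "https://token.actions.githubusercontent.com"


@dataclass(frozen=True)
class ImageEvidence:
    image: str
    signer_subject: Optional[str]    # None: no valid signature found
    signer_issuer: Optional[str]
    sbom_attested_by: Optional[str]  # subject that signed the SBOM, if any


def admit(evidence: ImageEvidence) -> tuple[bool, str]:
    """Apply the signed / approved identity / SBOM checks in order."""
    if evidence.signer_subject is None or evidence.signer_issuer is None:
        return False, "require-signed-images: no valid signature"
    if (evidence.signer_issuer != ALLOWED_ISSUER
            or not fnmatchcase(evidence.signer_subject, ALLOWED_SUBJECT)):
        return False, "require-signed-images: signer is not allow-listed"
    if evidence.sbom_attested_by != evidence.signer_subject:
        return False, "require-sbom-attestation: SBOM missing or signed by another identity"
    return True, "admitted"


# Hypothetical workflow identity, for illustration only.
SUBJECT = "https://github.com/opscart/example-repo/.github/workflows/build.yml@refs/heads/main"
GOOD = ImageEvidence("ghcr.io/opscart/example-image", SUBJECT, ALLOWED_ISSUER, SUBJECT)
UNSIGNED = ImageEvidence("ghcr.io/opscart/example-image", None, None, None)

print(admit(GOOD))
print(admit(UNSIGNED))
```

Returning the name of the failing policy alongside the verdict mirrors the property that makes admission rejections useful: the person deploying learns which rule failed, not merely that something did.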

Figure 4 — A Kyverno admission rejection. The policy names the violating image and the specific rule; the user gets immediate, actionable feedback at deploy time rather than discovering misconfiguration in production.
The architectural reason this matters: admission is the single point at which workload intent is most cheaply rejected. Catching an unsigned image at admission costs the user one re-run of kubectl apply. Catching the same workload running in production a week later costs a security ticket, an incident response, and a regulatory disclosure conversation. Moving rejection earlier in the lifecycle is one of the highest-leverage architectural decisions in the entire pattern.
This is the heart of the Trust layer: declarative, version-controlled, auditable policy that defines what counts as “trusted enough to run”. Everything upstream produces evidence; everything downstream consumes the assertion that the evidence was acceptable.
7. Enforcement Layer — Continuous Drift and Vulnerability Audits
Admission is necessary but insufficient. A cluster admits a workload at time T₀ based on the state of policy and the signature at T₀. Production runs at T₀ + days or T₀ + months. The Enforcement layer asks the continuous-time questions: Has the digest drifted? Has the signing key been revoked? Has a new unsigned workload landed via a controller that bypasses admission? Are running images carrying SBOMs with newly-public CVEs?
In the lab, two complementary audits cover this surface area:
`analyze-drift.py` walks every namespace, collects every running pod’s image references and digests, and compares them against a manifest of last-known-good signed digests. It produces a structured report: which workloads are in compliance, which have drifted, which are unsigned, which lack SBOMs. This is the architectural Enforcement layer in the strict sense — it audits trust-control-plane invariants over time.
`audit-fleet.sh` is a complementary scanner-driven audit. It runs a vulnerability scanner across the fleet’s running images and aggregates the findings by cohort. Where analyze-drift.py answers “is what we admitted still true?”, audit-fleet.sh answers “what vulnerabilities exist in what we admitted?”.
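The core comparison in a drift audit is small enough to sketch. The code below is not the lab's analyze-drift.py; it is a simplified model that assumes the running images and the manifest of last-known-good digests have already been collected, and that signature status was determined by a separate verification step. Workload names and digests are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunningImage:
    workload: str
    digest: str
    signed: bool     # result of a signature check performed elsewhere
    has_sbom: bool


def classify(image: RunningImage, known_good: dict[str, str]) -> str:
    """Assign one state per workload; the first failing check wins."""
    if not image.signed:
        return "unsigned"
    if known_good.get(image.workload) != image.digest:
        return "drifted"
    if not image.has_sbom:
        return "missing_sbom"
    return "compliant"


def build_report(running: list[RunningImage],
                 known_good: dict[str, str]) -> dict[str, list[str]]:
    """Group workloads by state so each bucket can be routed to an owner."""
    report: dict[str, list[str]] = {
        "compliant": [], "drifted": [], "unsigned": [], "missing_sbom": [],
    }
    for image in running:
        report[classify(image, known_good)].append(image.workload)
    return report


# Placeholder data, not taken from the lab's fleet.
KNOWN_GOOD = {"api": "sha256:aaa", "worker": "sha256:bbb", "cron": "sha256:ccc"}
RUNNING = [
    RunningImage("api", "sha256:aaa", signed=True, has_sbom=True),
    RunningImage("worker", "sha256:zzz", signed=True, has_sbom=True),
    RunningImage("cron", "sha256:ccc", signed=True, has_sbom=False),
    RunningImage("legacy", "sha256:ddd", signed=False, has_sbom=False),
]

print(build_report(RUNNING, KNOWN_GOOD))
```

The ordering of checks is a design choice in this sketch: an unsigned image is reported as unsigned even if its digest also differs, on the reasoning that the missing signature is the more fundamental finding.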

Figure 5a — Fleet drift audit: signing-state vs CVE correlation across a 12-service synthetic fleet. In this constructed variation matrix, unsigned services averaged 13.0 critical CVEs vs 0.0 for signed-verified, a gap the audit surfaces continuously and attributably.

Figure 5b — Business-context attribution: findings assigned by compliance scope, owning team, SLA window, and remediation priority. The Enforcement layer’s job is not to find vulnerabilities but to route them into the Supply Chain layer with enough context for action.
A measured result from the audit on this fleet: unsigned services average 13.0 critical CVEs and signed-verified services average 0.0. The audit itself emits the comparison as an insight line (“Unsigned services average 130× more critical CVEs than signed_verified”); because the signed-verified average is 0.0, that multiplier cannot be a literal quotient of the two averages and is best read as an order-of-magnitude indicator. The fleet intentionally mixes DHI, Docker Hub, internally-built, and abandoned images to validate audit discrimination across the realistic origin landscape regulated teams face. This is a controlled synthetic-fleet measurement, not field data, and the gap reflects the specific package mix and the scanner’s database state at the time of testing. What the pattern provides is not a guaranteed ratio but the continuous, attributable surfacing of whatever the ratio actually is, including, importantly, the cases where it is small and the supposed benefit of hardening is harder to defend.
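The cohort comparison itself is simple arithmetic, and a sketch makes the zero-baseline edge case explicit. The numbers below are invented for illustration and are not the lab's measurements; the field names and both helper functions are assumptions, not the audit script's interface.

```python
from collections import defaultdict
from statistics import mean
from typing import Optional


def mean_criticals_by_cohort(services: list[dict]) -> dict[str, float]:
    """Average critical-CVE count per signing state."""
    buckets: dict[str, list[int]] = defaultdict(list)
    for service in services:
        buckets[service["signing_state"]].append(service["critical_cves"])
    return {state: mean(counts) for state, counts in buckets.items()}


def ratio_or_none(numerator: float, denominator: float) -> Optional[float]:
    """Return the quotient, or None when the baseline is zero."""
    return None if denominator == 0 else numerator / denominator


# Invented sample fleet, not the lab's data.
FLEET = [
    {"name": "svc-a", "signing_state": "unsigned", "critical_cves": 4},
    {"name": "svc-b", "signing_state": "unsigned", "critical_cves": 8},
    {"name": "svc-c", "signing_state": "signed_verified", "critical_cves": 0},
    {"name": "svc-d", "signing_state": "signed_verified", "critical_cves": 0},
]

AVERAGES = mean_criticals_by_cohort(FLEET)
print(AVERAGES)
print(ratio_or_none(AVERAGES["unsigned"], AVERAGES["signed_verified"]))
```

Reporting the two averages side by side, and declining to print a multiplier when the baseline is zero, keeps the headline number honest about what was actually measured.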
In production, both audits run on a schedule (the lab includes CronJob manifests). Findings are emitted as Kubernetes events and can flow into any alerting backend. The crucial architectural property is that findings have an owner: the team whose workload drifted or whose dependencies acquired new CVEs, with a clear remediation path that re-engages the Supply Chain layer.
8. Threat Model: What This Pattern Addresses, and What It Doesn’t
An architectural pattern earns credibility by being explicit about which threats it counters and which it doesn’t. The trust control plane addresses a defined slice of the container-security threat surface; running it is not equivalent to “secured.”
Threats the pattern addresses:
- Supply-chain tampering of images. A malicious image substituted into the registry under the expected tag will fail signature verification at admission and be rejected. The same image, if pulled before signature checks are added, will be flagged by analyze-drift.py because its digest does not appear in the signed-digests manifest.
- Unsigned-image execution. Any pod referencing an image without a valid signature from an approved identity is rejected at admission. This catches the most common production gap: well-meaning workloads that were never wired into the signing pipeline.
- Unauthorized registry use and in-cluster drift. The identity matcher rejects workloads from unapproved registries at admission; the continuous drift audit catches digest changes caused by mutating webhooks or controller misbehavior after admission has passed.
- Stale trust under key rotation. When signing keys or OIDC subjects change, the audit surfaces workloads still tied to revoked identities.
- SBOM-absent workloads. Workloads without attached SBOM attestations cannot be admitted, ensuring downstream vulnerability inventories are possible.
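The stale-trust case in the list above can be sketched as a set-membership check. This assumes the audit already knows which signer identity each running workload was admitted under and has a list of identities considered revoked; both inputs, the type, and the names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdmittedWorkload:
    name: str
    signer_subject: str  # identity that signed the admitted image


def stale_trust(workloads: list[AdmittedWorkload],
                revoked_subjects: set[str]) -> list[str]:
    """Names of workloads still running under a revoked signer identity."""
    return sorted(w.name for w in workloads if w.signer_subject in revoked_subjects)


# Placeholder identities for illustration.
OLD_ID = "https://github.com/example-org/old-pipeline"
NEW_ID = "https://github.com/example-org/new-pipeline"
WORKLOADS = [
    AdmittedWorkload("billing", OLD_ID),
    AdmittedWorkload("search", NEW_ID),
    AdmittedWorkload("auth", OLD_ID),
]

print(stale_trust(WORKLOADS, revoked_subjects={OLD_ID}))
```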
Threats the pattern does not address:
- Runtime kernel exploits and container escape. Once a verified image is running, kernel-level CVEs (e.g., past Dirty Pipe-class vulnerabilities) are outside this pattern’s scope. Defense-in-depth here is gVisor, Kata Containers, microVMs, or kernel patching.
- Application-level vulnerabilities. A signed image whose application code has a SQL-injection or SSRF flaw will admit and run normally. The trust control plane is silent on application security; that is WAF, SAST, DAST, and runtime application-security tooling.
- Insider compromise of signing keys. If an attacker controls the signing identity (a stolen OIDC token, a compromised GitHub Actions secret, a malicious maintainer with commit rights), the pattern’s verification will pass. Mitigations here are signer hardening, hardware-backed identities, Rekor transparency-log monitoring, and identity-bound build environments.
- Compromised CI environment. A trusted CI pipeline that has been tampered with can sign malicious images using legitimate identity. SLSA Level 3+ provenance attestations are the architectural defense; this lab includes the optional provenance variant but does not enforce SLSA Level 3 by default.
- Side-channel and data-exfiltration paths. Network policy, egress filtering, and pod security standards address these; the trust control plane does not.
This list is not exhaustive but is sufficient for credibility: a workload deployed through the trust control plane is materially safer against supply-chain and governance threats, and no safer against the threats in the second list than any other Kubernetes workload. The pattern composes with — does not replace — runtime security, network policy, and application-level controls.
9. Production Friction: What Hardened Images Actually Cost You
The model works. It is also not free. An honest article about hardened images in regulated production has to enumerate the friction, because the friction is what turns “we’ll roll this out next quarter” into “we abandoned the migration after two months.” The companion lab’s TROUBLESHOOTING.md is the long version of this section; what follows are the highlights.
No shell. Truly distroless hardened images (DHI, Chainguard, Google’s distroless variants) don’t include /bin/sh. They don’t include curl, wget, cat, or ls. (The lab’s worked self-built example uses Alpine, which retains a busybox shell; the friction below applies in full force when migrating to vendor distroless bases.) When an engineer is paged at 2 a.m. and reflexively runs kubectl exec -it offending-pod -- /bin/sh, the command fails. The remediation is kubectl debug with an ephemeral debug container, typically busybox, that targets the running container’s process namespace. Train your on-call rotation on kubectl debug before migration, not after.
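For runbooks and on-call tooling, it can help to generate the debug command rather than rely on memory at 2 a.m. The helper below is a small sketch: the function name, pod name, and container name are hypothetical, while `--image`, `--target`, `--namespace`, and `-it` are standard `kubectl debug` options.

```python
import shlex


def kubectl_debug_argv(pod: str, target_container: str,
                       namespace: str = "default",
                       debug_image: str = "busybox") -> list[str]:
    """Build a kubectl debug command that attaches an ephemeral container."""
    return [
        "kubectl", "debug", "-it", pod,
        "--namespace", namespace,
        f"--image={debug_image}",
        # --target shares the process namespace of the named container.
        f"--target={target_container}",
    ]


# Hypothetical pod and container names.
DEBUG_ARGV = kubectl_debug_argv("offending-pod", "app", namespace="payments")
print(shlex.join(DEBUG_ARGV))
```

Process-namespace targeting depends on the container runtime supporting it, so the generated command is worth rehearsing in a non-production cluster before it is needed.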
Migration is not a `FROM` line change. Hardened images differ from Debian/Ubuntu bases in numerous small ways that compound. The default user is typically a non-root user named nonroot (UID 65532) rather than root; this breaks any image that writes to root-owned paths such as / at runtime. Library paths differ; binaries built against glibc on Debian may fail against musl or against distroless’s minimal glibc. Required system packages (ca-certificates, timezone data, locales) that come for free in stock bases must be explicitly carried over. The lab’s worked example migrates a Python service onto a DHI base in a multi-stage Dockerfile, not a one-line FROM change.
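Because the lab's worked example is a Python service, one inexpensive way to surface the non-root breakage early is a startup preflight that reports which directories the process cannot write to. This is a generic sketch, not code from the lab; the function names are hypothetical and the directory list is something each service would supply for itself.

```python
import os
import tempfile


def running_as_root() -> bool:
    """True when the effective UID is 0 (always False where geteuid is absent)."""
    return hasattr(os, "geteuid") and os.geteuid() == 0


def unwritable_paths(paths: list[str]) -> list[str]:
    """Return the paths the current user cannot write to (or that do not exist)."""
    return [p for p in paths if not os.access(p, os.W_OK)]


# Example: check a scratch directory plus a path that should not exist.
with tempfile.TemporaryDirectory() as scratch:
    problems = unwritable_paths([scratch, "/nonexistent-path-for-preflight-sketch"])

print("running as root:", running_as_root())
print("unwritable:", problems)
```

Running a check like this in CI against the hardened image, under its default user, turns a class of runtime permission errors into a build-time failure with a clear list of paths to fix.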
Signature paths vary. As noted in Section 5, DHI signatures were observed resolving via registry.scout.docker.com rather than at the image’s own registry path. Naive verification (cosign verify <image> with no alternate signature repository configured, which cosign reads from the COSIGN_REPOSITORY environment variable) will fail on DHI images. Kyverno handles this through the policy’s repository field, but bespoke tooling needs to know. Plan to audit any custom verification code before migration; expect to update three to five locations even in a modest platform.
Observability tooling. Some APM agents expect a shell in the parent container namespace; init container patterns work around this but each integration needs verification. DHI free-tier images may have pull-rate limits or catalog restrictions; teams needing SLA-backed support or the full catalog should evaluate Docker’s commercial tiers. Namespace-level pull secrets need planning before migration starts.
None of these is a deal-breaker. All of them together are why migrations slip.
10. When This Pattern Is Overkill
The honest counter-question to “should we build a trust control plane?” is “for which workloads?” The investment’s value scales with three factors:
Regulatory pressure. HIPAA, PCI-DSS, SOC 2 Type II, FDA 21 CFR Part 11, GDPR’s processor obligations, and equivalents elsewhere increasingly require evidence-based assurance about software supply chains. The trust control plane is what produces that evidence in a form auditors accept. For workloads under these regimes, the pattern is an audit-cost reduction, not just a security investment.
Fleet size and heterogeneity. A platform team operating 8+ production clusters across multiple business units, with hundreds of microservices and dozens of independent teams pushing images, faces a coordination problem that admission policy solves elegantly. A two-person startup with a monolith on a single VPC has a coordination problem of a fundamentally different kind, and admission policy is mostly bureaucracy.
Blast radius. What happens if a single workload is compromised? For an internal dashboard read by five engineers, the answer is “we restore from backup and write a postmortem.” For a service handling pharmaceutical patient identifiers, the answer involves regulators, lawyers, and disclosure timelines. The investment justifies itself only where the loss case is asymmetric.
Concretely: pre-production internal tools, side projects, prototypes, and developer sandboxes do not need this pattern. They benefit from a hardened base image (free) and should not be put behind the full trust control plane, because the overhead of policy maintenance, key rotation, and drift remediation outstrips the risk reduction.
The pattern earns its keep on production workloads in regulated environments, on platforms with enough scale that centralized policy is cheaper than per-team review, and on services where the loss case is large. For most other workloads, the Supply Chain layer alone — sign and SBOM your builds — captures most of the available value at a small fraction of the cost.
11. Conclusion: Architecture Over Image Choice
Hardened images are useful. The point of this article is not that they aren’t. The point is that they are one component of a broader architectural pattern — the trust control plane — and that the security outcomes regulated teams want are properties of the pattern, not of the component.
A team that adopts hardened images without the surrounding pattern has made a real but limited improvement. A team that adopts the pattern with any reasonable image vendor — DHI, Chainguard, or a self-built base signed by a project-owned identity — has built portable, vendor-neutral institutional capability. The substitution test is what distinguishes the two situations: ask whether a future migration away from your current image vendor is a configuration edit or a structural rewrite. If it is the former, you have the pattern. If it is the latter, you have a product dependency.
The companion repository (github.com/opscart/docker-security-practical-guide, tag v1.12.0) is the canonical, reproducible evidence for everything in this article: working Kyverno policies, signed sample images you can pull and verify yourself, both audits, the substitution-test configurations for all three image sources, and the troubleshooting log of every production friction point worth flagging. Five hypothesis-driven experiments demonstrate the pattern’s properties under controlled conditions.
The take-away for architects making image-strategy decisions: spend the design effort on the pattern. The image will be replaceable. The long-term value of the pattern is that governance survives vendor replacement.

