@bitwarden/cli Scored 92/100. It Just Got Compromised.

Nine maintainers. Seven years of development. 78K weekly downloads. A behavioral trust score of 92 out of 100. And today, a malicious version landed on the npm registry via a compromised CI/CD pipeline. Here’s what that exposes — and what it doesn’t.

Today, Socket.dev reported that @bitwarden/cli was compromised as part of an ongoing Checkmarx supply chain campaign. Attackers abused a GitHub Action in Bitwarden’s CI/CD pipeline to inject malicious code into version 2026.4.0. The payload scraped GitHub tokens from Runner.Worker memory, exfiltrated npm tokens, and injected persistence code into ~/.bashrc and ~/.zshrc.

This was not a typosquatting attack. There was no b1twarden-cli or bitwarden-c1i. Attackers hit the official package — the one with the verified maintainer team, the corporate backing, the years of documented releases.

Here’s what proof-of-commitment shows for it:

$ npx proof-of-commitment @bitwarden/cli

Package          Risk      Score  Maintainers  Downloads    Age
────────────────────────────────────────────────────────────────────
@bitwarden/cli   ✅ SAFE   92     9            77,980/wk    7.9y
  └ longevity=25 momentum=17 releases=20 maintainers=15 github=15

92 out of 100. SAFE. No flags. This score is correct — and that’s exactly the problem.
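The breakdown line under the score lists five components. As a quick sanity check, the sketch below parses that line and adds the parts up. It assumes the total is a plain sum of the components, which the output is consistent with but which isn't stated anywhere above; `parse_breakdown` is a hypothetical helper, not part of the tool.

```python
def parse_breakdown(line: str) -> dict[str, int]:
    """Collect name=value pairs from a breakdown line, ignoring other tokens."""
    parts = {}
    for token in line.split():
        if "=" in token:
            name, _, value = token.partition("=")
            parts[name] = int(value)
    return parts


breakdown = parse_breakdown(
    "  └ longevity=25 momentum=17 releases=20 maintainers=15 github=15"
)
total = sum(breakdown.values())
print(total)  # 92
```

None of the five inputs describes the contents of a specific release, which is why a malicious 2026.4.0 leaves the total untouched.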

Two different threat models

Behavioral scoring like proof-of-commitment measures structural risk: the profile of a package that, by its nature, is dangerous to depend on. Single maintainer with massive download volume. Abandoned packages still pulled by millions of projects. No organizational backing. Stale codebases with no release in years.

The Bitwarden attack is something different: active supply chain compromise. The package was structurally sound. The organization, the maintainer team, the release history — all legitimate. The attack vector was the build pipeline itself.

These are two distinct threat classes, and they require different tooling:

Threat type                      Example                             Detected by behavioral scoring?
──────────────────────────────────────────────────────────────────────────────────────────────────────────
Single-maintainer concentration  axios (1 maintainer, 100M/wk)       ✅ Yes: CRITICAL flag
Abandonment risk                 left-pad, event-stream              ✅ Yes: daysSinceLastPublish
Typosquatting                    bitarden-cli vs @bitwarden/cli      ✅ Partially: new package with 0 history
CI/CD pipeline compromise        This attack                         ❌ No: scores the package, not the build
Account takeover                 Compromised maintainer credentials  ❌ No: same limitation
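To make the first two rows concrete, here is a minimal sketch of structural flagging. The thresholds, field names, and flag labels are assumptions chosen for illustration and are not proof-of-commitment's actual rules. The days-since-publish values are also made up.

```python
from dataclasses import dataclass


@dataclass
class PackageStats:
    name: str
    maintainers: int
    weekly_downloads: int
    days_since_last_publish: int


# Hypothetical thresholds, for illustration only.
HIGH_DOWNLOADS = 10_000_000
STALE_DAYS = 730


def structural_flags(pkg: PackageStats) -> list[str]:
    """Return structural risk flags based only on package-level metadata."""
    flags = []
    if pkg.maintainers <= 1 and pkg.weekly_downloads >= HIGH_DOWNLOADS:
        flags.append("single-maintainer concentration")
    if pkg.days_since_last_publish >= STALE_DAYS:
        flags.append("abandonment risk")
    return flags


# Maintainer and download figures come from the article; publish ages are assumed.
axios = PackageStats("axios", 1, 100_000_000, 30)
bitwarden = PackageStats("@bitwarden/cli", 9, 77_980, 0)

print(structural_flags(axios))      # ['single-maintainer concentration']
print(structural_flags(bitwarden))  # []
```

Nothing in `PackageStats` changes when a build pipeline is hijacked, so the compromised package produces the same empty list as the clean one.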

What actually catches CI/CD pipeline attacks

The npm ecosystem has a real answer for this: build provenance. npm has supported provenance statements since 2023. These are cryptographically signed attestations that link a published package to its source repository and build environment. When a package has provenance, you can verify that version 2026.4.0 was built from commit abc123 in a specific GitHub Actions workflow run.

A compromised GitHub Action that publishes outside the expected workflow breaks the provenance chain. npm audit signatures can detect this.

Run this right now:

npm audit signatures

It verifies the registry signatures and provenance attestations of every installed package. Packages without provenance get flagged as “Missing Provenance” (not a vulnerability, but a signal). Packages with forged or broken provenance fail the check. As of today, the current version of @bitwarden/cli passes the check. Provenance was not the vector in this attack, but the tooling exists to catch build-pipeline tampering.
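For a rough idea of what "has provenance" looks like in registry metadata, the sketch below inspects a single version entry. It assumes the registry exposes attestations under `dist.attestations.provenance`, which matches my understanding of the metadata format but should be checked against a live response. The sample dictionaries and URLs are made up, and `has_provenance` is a hypothetical helper.

```python
def has_provenance(version_meta: dict) -> bool:
    """Report whether a version's metadata advertises a provenance attestation.

    Presence only: this does not verify any signature.
    """
    attestations = version_meta.get("dist", {}).get("attestations") or {}
    return bool(attestations.get("provenance"))


# Made-up sample entries shaped like the assumed registry format.
with_provenance = {
    "dist": {
        "tarball": "https://example.invalid/pkg-1.0.0.tgz",
        "attestations": {
            "url": "https://example.invalid/attestations/pkg@1.0.0",
            "provenance": {"predicateType": "https://slsa.dev/provenance/v1"},
        },
    }
}
without_provenance = {"dist": {"tarball": "https://example.invalid/pkg-0.9.0.tgz"}}

print(has_provenance(with_provenance))     # True
print(has_provenance(without_provenance))  # False
```

This only tells you an attestation is advertised. Actually verifying it is the job of npm audit signatures.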

The honest picture

Here’s what I tell people who ask what proof-of-commitment catches:

It catches the risk profile that makes an attack worth launching — the structural conditions that create high-value targets. A package with 100 million weekly downloads and one maintainer is worth attacking because the blast radius is enormous and the attack surface is concentrated. Behavioral scoring quantifies that.

It does not detect in-progress attacks. That requires a different layer: runtime analysis, provenance verification, or tools like Socket.dev that actively scan new publishes for malicious code patterns.

The supply chain security stack that actually works is layered:

  1. Structural scoring (proof-of-commitment) — know your risk profile before an attack
  2. Provenance verification (npm audit signatures) — verify builds come from expected pipelines
  3. Real-time publish scanning (Socket, Snyk) — catch malicious code in new releases
  4. Lock file pinning — don’t pull new versions automatically
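Layer 4 is the easiest to check by hand. A lock file pins what actually gets installed, and a related check is to see which manifest entries still allow floating ranges. The sketch below does that for a package.json-shaped dictionary. The helper name, the regex, and the sample versions are all illustrative assumptions.

```python
import re

# Matches exact versions such as 1.2.3, 1.2.3-beta.1, or 1.2.3+build.5.
EXACT_VERSION = re.compile(
    r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def floating_dependencies(manifest: dict) -> dict[str, str]:
    """Return dependencies whose specifier is anything other than an exact version."""
    floating = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT_VERSION.match(spec):
                floating[name] = spec
    return floating


# Made-up manifest; the version numbers are for illustration only.
manifest = {
    "dependencies": {"@bitwarden/cli": "^2026.3.0", "left-pad": "1.3.0"},
    "devDependencies": {"typescript": "~5.4.0"},
}

print(floating_dependencies(manifest))
# {'@bitwarden/cli': '^2026.3.0', 'typescript': '~5.4.0'}
```

A caret range like ^2026.3.0 would accept a freshly published 2026.4.0 on the next unlocked install, which is exactly the window a lock file closes.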

The typosquatting difference

The one thing behavioral scoring does catch that provenance doesn’t: brand-new malicious packages impersonating legitimate ones. When attackers register bitawarden-cli or @bitwarden/cIi (capital I instead of lowercase L), those packages have no history. Score of 0. Zero downloads before the attack. Published this week. That’s a different signal, and it’s detectable.
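Here is a minimal sketch of that signal: a package with almost no history whose name sits within a small edit distance of a known one. The name-similarity check and every threshold are assumptions added for illustration. They are not described as part of proof-of-commitment.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


# Hypothetical thresholds.
MAX_DISTANCE = 2
MAX_AGE_DAYS = 30
MAX_WEEKLY_DOWNLOADS = 100


def looks_like_fresh_impersonator(
    name: str, age_days: int, weekly_downloads: int, known_names: list[str]
) -> bool:
    """Flag a new, barely downloaded package whose name nearly matches a known one."""
    is_new = age_days <= MAX_AGE_DAYS and weekly_downloads <= MAX_WEEKLY_DOWNLOADS
    is_near = any(0 < edit_distance(name, known) <= MAX_DISTANCE for known in known_names)
    return is_new and is_near


known = ["bitwarden-cli", "@bitwarden/cli"]
print(looks_like_fresh_impersonator("bitawarden-cli", 3, 0, known))          # True
print(looks_like_fresh_impersonator("@bitwarden/cIi", 3, 0, known))          # True
print(looks_like_fresh_impersonator("@bitwarden/cli", 2880, 77_980, known))  # False
```

The real package fails the "new" test on both age and downloads, so it is never flagged, which again is the right answer for this threat class and the wrong tool for a pipeline compromise.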

The Bitwarden attack was more sophisticated: it targeted the real package. That requires the provenance layer.

What we’re adding

We’re working on adding provenance status to the scoring output — flagging packages that should have provenance (large organizations, high download volume) but don’t. This doesn’t catch active attacks, but it surfaces the gap in your audit trail before something goes wrong.
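As a rough sketch of what such a flag could look like (the feature isn't built, so the function name, inputs, and threshold here are all hypothetical):

```python
# Hypothetical cutoff for "high download volume".
HIGH_DOWNLOADS = 50_000


def provenance_gap(weekly_downloads: int, org_backed: bool, has_provenance: bool) -> bool:
    """True when a package is expected to publish provenance but does not."""
    expected = org_backed or weekly_downloads >= HIGH_DOWNLOADS
    return expected and not has_provenance


print(provenance_gap(2_000_000, org_backed=True, has_provenance=False))  # True
print(provenance_gap(2_000_000, org_backed=True, has_provenance=True))   # False
print(provenance_gap(120, org_backed=False, has_provenance=False))       # False
```

Small hobby packages are left alone on purpose: the flag is meant to surface a missing audit trail where one is reasonable to expect.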

Check your current dependencies:

# Structural risk
npx proof-of-commitment --file package.json

# Provenance (built into npm)
npm audit signatures

Run both. They answer different questions. Both answers matter.


proof-of-commitment is open-source and free to use.