Proof-of-Commitment Internals

The scoring algorithm, the CRITICAL flag, the bulk download optimization, and real benchmark data for chalk, express, and hono. All public data. All reproducible.

npm audit is a CVE scanner. It queries a database of known vulnerabilities against the package versions in your lockfile. When a CVE is filed, catalogued, and propagated, it will appear. Before that: silence.

The problem with that model is that supply chain attacks don't announce themselves. Before the ua-parser-js attack (October 2021, 8M weekly downloads), before the LiteLLM attack (March 2026), before the Bitwarden CLI incident — every tool returned clean. The structural preconditions for compromise were visible in public registry data. The tools just weren't looking.

Proof-of-commitment measures those preconditions. Here's how the scoring works, from registry data to CRITICAL flag.

Five Dimensions, All Public Data

Every package gets scored across five behavioral dimensions. No proprietary data sources, no scraping, no access required beyond the public npm registry and GitHub API.

1. Longevity (25 pts)

Package age in years, from pkg.time["created"] in the registry response. Time is an unfakeable signal — you can't buy 10 years of operational history.

function scoreLongevity(ageYears: number): number {
  if (ageYears >= 6) return 25;
  if (ageYears >= 4) return 20;
  if (ageYears >= 2) return 14;
  if (ageYears >= 1) return 8;
  if (ageYears >= 0.5) return 4;
  return 1;
}

The scoring is deliberately nonlinear. The difference between 1 and 6 years isn't linear in risk — a 6-year-old package is embedded in thousands of production systems. A 1-year-old package mostly isn't.
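
For the input side, here is a minimal sketch of how ageYears could be derived from the registry's creation timestamp. It assumes the value at pkg.time["created"] is an ISO 8601 string. The helper name ageYearsFromCreated and the 365.25-day year are illustrative assumptions, not taken from the project source.

```typescript
// Hypothetical helper (not from the project source): converts the registry's
// creation timestamp into a package age in years.
// Assumption: one year is treated as 365.25 days on average.
const MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000;

function ageYearsFromCreated(created: string, now: Date = new Date()): number {
  const createdMs = Date.parse(created);
  if (Number.isNaN(createdMs)) {
    throw new Error(`Unparseable creation timestamp: ${created}`);
  }
  // Clamp at zero so clock skew never produces a negative age.
  return Math.max(0, (now.getTime() - createdMs) / MS_PER_YEAR);
}
```

The result can be passed straight to scoreLongevity. Passing now explicitly keeps the calculation reproducible when re-running a benchmark for a fixed date.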

2. Download Momentum (25 pts)

Weekly download volume sets the blast radius. Trend direction (growing vs. declining) captures attacker attention — growing packages attract more eyes, malicious and otherwise.

function scoreDownloads(
  weeklyAvg: number,
  trend: "growing" | "stable" | "declining" | null
): number {
  let base = 0;
  if (weeklyAvg >= 1_000_000) base = 22;
  else if (weeklyAvg >= 100_000) base = 18;
  else if (weeklyAvg >= 10_000) base = 14;
  else if (weeklyAvg >= 1_000) base = 10;
  else if (weeklyAvg >= 100) base = 6;
  else base = 3;

  const trendMod = trend === "growing" ? 3 : trend === "declining" ? -3 : 0;
  return Math.max(0, Math.min(25, base + trendMod));
}

Trend is derived from the last 90 days of daily download data: first half vs. second half, with >15% change as the threshold. Growing = >1.15×, declining = <0.85×.
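
The half-vs-half comparison above can be sketched roughly as follows. The function name classifyTrend, the use of averages rather than sums, and the null return for empty or all-zero data are assumptions made for illustration. Only the 1.15× and 0.85× thresholds come from the description above.

```typescript
type Trend = "growing" | "stable" | "declining" | null;

// Hypothetical helper: classifies the trend from daily download counts,
// ordered oldest first (for example, the last 90 days).
function classifyTrend(daily: number[]): Trend {
  if (daily.length < 2) return null; // assumption: too little data means no trend

  const mid = Math.floor(daily.length / 2);
  const average = (xs: number[]): number =>
    xs.reduce((a, b) => a + b, 0) / xs.length;

  // Averages keep the comparison fair if the halves differ in length.
  const firstHalf = average(daily.slice(0, mid));
  const secondHalf = average(daily.slice(mid));
  if (firstHalf === 0) return null; // assumption: avoid dividing by zero

  const ratio = secondHalf / firstHalf;
  if (ratio > 1.15) return "growing";
  if (ratio < 0.85) return "declining";
  return "stable";
}
```

The return type matches the trend parameter of scoreDownloads, so a null result simply leaves the base score unmodified.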

3. Release Consistency (20 pts)

Version count plus days since last publish. The version count captures maintenance history. Recency catches zombie packages: still widely depended on, no longer actively maintained, with old publish access still live.

function scoreReleases(versionCount: number, daysSincePublish: number): number {
  let base = 0;
  if (versionCount >= 100) base = 15;
  else if (versionCount >= 30) base = 12;
  else if (versionCount >= 10) base = 9;
  else if (versionCount >= 3) base = 6;
  else base = 3;

  const recency =
    daysSincePublish < 30 ? 5 :
    daysSincePublish < 90 ? 3 :
    daysSincePublish < 365 ? 1 : 0;

  return Math.min(20, base + recency);
}

chalk scores 13/20 here: at 181 days since last publish it earns only 1 recency point on top of a 12-point version-count base. That's not suspicious on its own, since chalk is stable, mature, and intentionally infrequent. But it contributes to a risk picture that's primarily set by the next dimension.

4. Maintainer Depth (15 pts)

This is the signal that drives the CRITICAL flag. Maintainer count comes directly from pkg.maintainers in the registry response — the list of npm accounts with publish rights.

function scoreMaintainers(count: number): number {
  if (count >= 5) return 15;
  if (count >= 3) return 11;
  if (count >= 2) return 7;
  if (count === 1) return 4;
  return 0;
}

A sole maintainer scores 4/15, the lowest non-zero value. The reasoning is structural: one compromised npm token means a malicious version goes to every downstream install. One phishing email, one credential leak, or one maintainer selling their package, and the blast radius is the full weekly download count.

5. GitHub Backing (15 pts)

If the package has a linked GitHub repository, we score that repository independently across five sub-dimensions: longevity, recent commit activity (last 30 days), contributor count, release cadence, and stars. The resulting 0–100 GitHub score maps linearly to 0–15 points.

Organization-backed repos score higher than personal repos for the same structural reason that multiple maintainers score higher than one: organizational accounts have multiple people with access, institutional continuity, and usually internal security practices. A personal repo is one compromised account away from full takeover.
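
The linear 0–100 to 0–15 mapping can be sketched as below. The function name scoreGithubBacking, rounding to the nearest integer, and awarding 0 points when no repository is linked are assumptions. The text above only specifies that the mapping is linear.

```typescript
// Maps a 0-100 GitHub repository score linearly onto the 0-15 point dimension.
// Assumptions (not confirmed by the source): results are rounded to the nearest
// integer, and a package without a linked repository earns 0 points.
function scoreGithubBacking(githubScore: number | null): number {
  if (githubScore === null) return 0;
  // Clamp defensively so out-of-range inputs cannot exceed the dimension cap.
  const clamped = Math.max(0, Math.min(100, githubScore));
  return Math.round((clamped / 100) * 15);
}
```

Under these assumptions, a repository score of 80 maps to 12 points, and anything at or above 100 caps out at 15.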

The CRITICAL Flag

The flag has two conditions, and both must be true:

  1. Single maintainer (maintainerDepth = 4/15)
  2. Weekly downloads > 10M

const riskFlags: string[] = [];
if (profile.maintainerCount === 1 && weeklyDownloads > 10_000_000) {
  riskFlags.push("CRITICAL");
}

The threshold is explicit and deterministic. You can reproduce it from the npm registry and downloads API with no proprietary data. The reasoning behind 10M: below that volume, a compromised package causes real damage but doesn't constitute an infrastructure-level event. Above it, the blast radius is broad enough that a well-resourced attacker has a meaningful incentive.

Two additional flags exist for adjacent risk profiles:

  • HIGH: package <1 year old with >1M weekly downloads — rapid adoption without operational track record.
  • WARN: package hasn't published in >365 days — likely unmaintained but still receiving traffic.
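
Putting the three flags together, a minimal sketch might look like the following. The FlagInput shape and the computeRiskFlags name are hypothetical, and the sketch assumes the flags are evaluated independently so that more than one can apply to the same package. The text above does not say whether they are mutually exclusive.

```typescript
// Hypothetical input shape; field names are illustrative.
interface FlagInput {
  maintainerCount: number;
  weeklyDownloads: number;
  ageYears: number;
  daysSincePublish: number;
}

// Assumption: flags are independent, so a package may carry several at once.
function computeRiskFlags(p: FlagInput): string[] {
  const flags: string[] = [];
  // CRITICAL: sole maintainer with more than 10M weekly downloads.
  if (p.maintainerCount === 1 && p.weeklyDownloads > 10_000_000) {
    flags.push("CRITICAL");
  }
  // HIGH: under one year old with more than 1M weekly downloads.
  if (p.ageYears < 1 && p.weeklyDownloads > 1_000_000) {
    flags.push("HIGH");
  }
  // WARN: no publish in more than 365 days.
  if (p.daysSincePublish > 365) {
    flags.push("WARN");
  }
  return flags;
}
```

All three checks use strict comparisons, matching the ">" and "<" wording in the flag definitions.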

How the Audit API Works

The POST /api/audit endpoint takes a list of package names and returns scored profiles. The naive implementation — firing one download API request per package concurrently — hits a documented npm rate limit: parallel requests to the downloads API return zeros. This was the first failure mode we hit in production.

The fix: batch all unscoped packages into a single bulk request before processing.

// One bulk HTTP request for all unscoped npm packages
const bulkWeekly = await bulkFetchNpmWeeklyDownloads(unscopedNpm);

// All packages processed concurrently — downloads already resolved
const allResults = await Promise.all(
  packages.map(async (pkg) => {
    const preloadedWeekly = pkg.startsWith("@")
      ? undefined                      // scoped: fetch individually
      : bulkWeekly.get(pkg);           // unscoped: use bulk result
    const profile = await buildNpmCommitmentProfile(pkg, preloadedWeekly);
    // ...
  })
);

The bulk API (/downloads/point/last-week) accepts up to 128 packages per request and returns a map of package → weekly count. Scoped packages (@scope/name) are not supported by the bulk endpoint — they fall back to individual fetches.
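
Because of the 128-package cap, an audit with more unscoped packages than that needs more than one bulk request. The sketch below shows one way to partition and batch the names. The helper names partitionPackages, chunk, and bulkDownloadUrls are hypothetical and are not the project's bulkFetchNpmWeeklyDownloads. The api.npmjs.org host is the public npm downloads API, and the code only builds URLs rather than fetching them.

```typescript
const BULK_LIMIT = 128; // per-request cap stated above

// Splits package names into scoped (@scope/name) and unscoped groups.
function partitionPackages(names: string[]): { scoped: string[]; unscoped: string[] } {
  const scoped: string[] = [];
  const unscoped: string[] = [];
  for (const pkgName of names) {
    (pkgName.startsWith("@") ? scoped : unscoped).push(pkgName);
  }
  return { scoped, unscoped };
}

// Generic fixed-size batching.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Builds one bulk URL per batch of at most 128 unscoped packages.
function bulkDownloadUrls(unscoped: string[]): string[] {
  return chunk(unscoped, BULK_LIMIT).map(
    (batch) => `https://api.npmjs.org/downloads/point/last-week/${batch.join(",")}`
  );
}
```

For a typical package.json the result is a single URL. The batching only matters for large lockfile-scale audits, where the batches would presumably be fetched one after another to stay clear of the rate limit described earlier.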

The registry metadata fetch and GitHub scoring run concurrently per package. For a 20-package audit, this is roughly 1 bulk download request up front, then 20 registry requests and up to 20 GitHub requests in parallel. Wall-clock time on cold Cloudflare Workers: typically 800ms–1.5s depending on GitHub API latency.

Benchmarks: Real Packages, Live Data

Actual response from the live API (April 29, 2026):

curl -X POST https://poc-backend.amdal-dev.workers.dev/api/audit \
  -H "Content-Type: application/json" \
  -d '{"packages": ["chalk", "express", "hono"]}'

Package  Score  Maintainers  Weekly DL  Age      Last Publish  Risk
chalk    75     1            418M       12.7 yr  181 days ago  🔴 CRITICAL
hono     79     1            34.5M      4.4 yr   4 days ago    🔴 CRITICAL
express  94     5            95M        15.3 yr  12 days ago   ✅ No flag

Score breakdowns:

chalk:
  longevity:           25/25  (12.7 years)
  downloadMomentum:    22/25  (418M/week)
  releaseConsistency:  13/20  (181 days since last publish)
  maintainerDepth:      4/15  (1 maintainer)
  githubBacking:       11/15

hono:
  longevity:           20/25  (4.4 years)
  downloadMomentum:    22/25  (34.5M/week)
  releaseConsistency:  20/20  (4 days since last publish)
  maintainerDepth:      4/15  (1 maintainer)
  githubBacking:       13/15

express:
  longevity:           25/25  (15.3 years)
  downloadMomentum:    22/25  (95M/week)
  releaseConsistency:  20/20  (12 days since last publish)
  maintainerDepth:     15/15  (5 maintainers)
  githubBacking:       12/15

chalk and hono both score in the 75–80 range and look reasonably healthy by conventional package health metrics. Both are CRITICAL. What separates them from express is one number: 5 maintainers vs. 1.
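
As a quick arithmetic check, the headline scores in the table are consistent with a plain sum of the five dimensions (25 + 25 + 20 + 15 + 15 = 100 maximum). The sketch below assumes the total is exactly that sum. The Breakdown type and totalScore name are illustrative, and the numbers are copied from the breakdowns above.

```typescript
// Illustrative shape mirroring the five dimensions in the breakdowns above.
interface Breakdown {
  longevity: number;          // max 25
  downloadMomentum: number;   // max 25
  releaseConsistency: number; // max 20
  maintainerDepth: number;    // max 15
  githubBacking: number;      // max 15
}

// Assumption: the overall score is the unweighted sum of the five dimensions.
function totalScore(b: Breakdown): number {
  return (
    b.longevity +
    b.downloadMomentum +
    b.releaseConsistency +
    b.maintainerDepth +
    b.githubBacking
  );
}

const chalkBreakdown: Breakdown = {
  longevity: 25, downloadMomentum: 22, releaseConsistency: 13,
  maintainerDepth: 4, githubBacking: 11,
};
const honoBreakdown: Breakdown = {
  longevity: 20, downloadMomentum: 22, releaseConsistency: 20,
  maintainerDepth: 4, githubBacking: 13,
};
const expressBreakdown: Breakdown = {
  longevity: 25, downloadMomentum: 22, releaseConsistency: 20,
  maintainerDepth: 15, githubBacking: 12,
};

console.log(totalScore(chalkBreakdown));   // 75
console.log(totalScore(honoBreakdown));    // 79
console.log(totalScore(expressBreakdown)); // 94
```

The 11-point gap in maintainerDepth accounts for most of the distance between express and the two flagged packages.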

What This Doesn't Cover

Code analysis. Proof-of-commitment doesn't inspect package contents. Socket.dev does this — static analysis of published code for suspicious patterns. These are complementary layers, not competitors. If you want to catch malicious code after it's been published, Socket.dev is the right tool. If you want to identify packages structurally positioned for compromise before any malicious code exists, use Commit.

CVE tracking. npm audit, Snyk, and Dependabot scan for known vulnerabilities. Commit doesn't duplicate this. Both layers serve different threat models: known CVEs (catalogued, after-the-fact) vs. structural exposure (predictive, before-the-fact).

Trajectory, not snapshot. The current implementation scores the current state of a package. A package that had 5 maintainers for 10 years and just dropped to 1 gets the same maintainerDepth score as one that's always had a single maintainer. This is a known gap — the score should track transitions, not just current state.

CRITICAL packages that never get attacked will always outnumber the ones that do. The flag identifies exposure, not certainty. Most sole-maintained packages with 100M weekly downloads are run by talented, security-conscious people. The risk is structural, not behavioral.

Run It On Your Stack

CLI:

npx proof-of-commitment --file package.json

API (replace with your packages):

curl -X POST https://poc-backend.amdal-dev.workers.dev/api/audit \
  -H "Content-Type: application/json" \
  -d '{"packages": ["chalk", "zod", "axios", "express", "hono"]}'

Browser: getcommit.dev/audit

The source is at github.com/piiiico/proof-of-commitment. The scoring functions are in src/backend/npm.ts. The audit endpoint is in src/backend/worker.ts.

Related:

  • How Commit Scores npm Packages: The Methodology (the reasoning behind the weights)
  • Why npm audit Returns Zero Vulnerabilities for the Most Dangerous Packages (where each tool fits in the stack)
  • Two Types of npm Supply Chain Attack (what each tool actually covers)