Specification v1.0.0

Trust Scoring Methodology

This document defines the scoring methodology used by Commit to assess structural trust in open-source packages and repositories. Every weight, threshold, and flag is published here. Scores are reproducible from public data.

Version 1.0.0
Date 2026-04-19
Status Active
Versioning Semver

01

Design Principles

Structural over behavioral

We score the container, not the contents. Commit does not analyze source code, scan for malware, or check for known CVEs. It measures the structural conditions that make a package a high-value target — or a resilient dependency.

Predictive over reactive

The goal is to identify targets before compromise, not to detect active attacks. A sole maintainer controlling 100M weekly downloads is a structural risk regardless of whether an attack has occurred.

Complementary to CVE scanning

Commit operates on different data than Snyk, Socket, or npm audit. Those tools detect known vulnerabilities and malicious code. Commit maps the structural conditions that produce the next vulnerability. Use both.

Open methodology, reproducible scores

Every weight and threshold is published in this specification. All input data comes from public APIs (npm registry, PyPI, GitHub). Given the same input data, the same score must be produced.

No declarations, only behavior

Badges, certifications, and self-reported security practices are excluded. Only observable, verifiable behavioral signals are scored. A package that claims best practices but has one maintainer and no releases in two years is scored on the latter.


02

Package Scoring: npm & PyPI

Package scores range from 0 to 100. Five dimensions are scored independently, then summed. The score reflects structural trust: the degree to which the package's maintenance structure is resilient to single points of failure.

25 pts

Longevity

Time is a cost signal. A package that has existed for six years has survived shifts in ecosystem preference, maintainer fatigue, and the natural decay of abandoned projects. Longevity is the strongest Lindy indicator available from registry metadata.

Package age     Points
≥ 6 years       25
4–6 years       20
2–4 years       14
1–2 years       8
0.5–1 year      4
< 6 months      1

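The table above can be sketched as a simple band lookup. This is an illustrative sketch, not Commit's implementation: the function name, the 365.25-day year, and the choice to treat each band's lower bound as inclusive (so exactly 4 years scores 20) are assumptions, since the published ranges overlap at their edges.

```python
from datetime import date

# (minimum age in years, points), checked from oldest to youngest.
# Assumption: each band includes its lower bound.
LONGEVITY_BANDS = [(6.0, 25), (4.0, 20), (2.0, 14), (1.0, 8), (0.5, 4)]


def longevity_points(created: date, today: date) -> int:
    """Map package age to Longevity points (25-point dimension)."""
    age_years = (today - created).days / 365.25  # assumed year length
    for min_years, points in LONGEVITY_BANDS:
        if age_years >= min_years:
            return points
    return 1  # younger than six months


print(longevity_points(date(2015, 1, 1), date(2026, 4, 19)))
```

Keeping the bands in a list keeps the published thresholds visible in one place, which matches the specification's emphasis on auditability.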
25 pts

Download Momentum

Download volume measures ecosystem adoption, but absolute count alone rewards incumbency. Download Momentum combines volume with a 90-day trend analysis, rewarding growing adoption and penalizing decline. The trajectory matters as much as the snapshot.

Base score is determined by weekly download volume (npm) or daily download volume (PyPI). A trend modifier of ±3 points is then applied based on the 90-day trajectory. Trend is the ratio of downloads in the last 45 days to downloads in the first 45 days: growing if the ratio is > 1.15, declining if it is < 0.85, stable otherwise. The final score is clamped to 0–25.

npm (weekly downloads)

Volume          Base
≥ 1,000,000     22
≥ 100,000       18
≥ 10,000        14
≥ 1,000         10
≥ 100           6
≥ 10            3
< 10            0

PyPI (daily downloads)

Volume          Base
≥ 5,000,000     22
≥ 500,000       18
≥ 50,000        14
≥ 5,000         10
≥ 500           6
≥ 50            3
< 50            0

Trend modifier  Growing: +3   Stable: 0   Declining: −3

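A minimal sketch of the Download Momentum calculation follows. The function names are hypothetical. It assumes the trend ratio is last-45-day downloads divided by first-45-day downloads, and that the input is a list of 90 daily counts ordered oldest first. The handling of an empty first half is also an assumption, since the specification does not define the ratio in that case.

```python
NPM_WEEKLY_BANDS = [(1_000_000, 22), (100_000, 18), (10_000, 14),
                    (1_000, 10), (100, 6), (10, 3)]
PYPI_DAILY_BANDS = [(5_000_000, 22), (500_000, 18), (50_000, 14),
                    (5_000, 10), (500, 6), (50, 3)]


def base_points(volume: int, bands: list[tuple[int, int]]) -> int:
    """Return the base score for a download volume using the given bands."""
    for minimum, points in bands:
        if volume >= minimum:
            return points
    return 0


def trend_modifier(daily_downloads: list[int]) -> int:
    """Return +3, 0 or -3 from 90 daily counts, oldest first."""
    first = sum(daily_downloads[:45])
    last = sum(daily_downloads[45:90])
    if first == 0:
        # Assumption: treat any downloads after an empty first half as growth.
        return 3 if last > 0 else 0
    ratio = last / first
    if ratio > 1.15:
        return 3
    if ratio < 0.85:
        return -3
    return 0


def momentum_points(volume: int, bands: list[tuple[int, int]],
                    daily_downloads: list[int]) -> int:
    """Base score plus trend modifier, clamped to 0-25."""
    score = base_points(volume, bands) + trend_modifier(daily_downloads)
    return max(0, min(25, score))


# A growing npm package with 2M weekly downloads: 22 base + 3 trend.
print(momentum_points(2_000_000, NPM_WEEKLY_BANDS, [100] * 45 + [130] * 45))
```

The clamp matters mainly at the bottom of the range: a package below the lowest band with a declining trend would otherwise score −3.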
20 pts

Release Consistency

A healthy package publishes regularly. Total version count demonstrates sustained investment over time. Recency of the last publish signals whether the project is actively maintained or dormant.

Version count (base)

Versions        Base
≥ 100           15
≥ 30            12
≥ 10            9
≥ 3             6
≥ 1             3

Recency bonus

Last publish    Bonus
< 30 days       +5
30–90 days      +3
90–365 days     +1
> 365 days      0

Final score is clamped to 0–20.
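As a rough sketch, Release Consistency is a base lookup plus a recency bonus, clamped to the dimension maximum. The function name is hypothetical. Two boundary choices are assumptions because the published ranges share their endpoints: a publish exactly 90 days ago is given +3, and one exactly 365 days ago is given +1. A package with no versions is assumed to score a base of 0.

```python
# (minimum version count, base points), highest band first.
VERSION_BANDS = [(100, 15), (30, 12), (10, 9), (3, 6), (1, 3)]
# (maximum days since last publish, bonus points), most recent band first.
RECENCY_BANDS = [(29, 5), (90, 3), (365, 1)]


def release_consistency_points(version_count: int, days_since_publish: int) -> int:
    """Base points from version count plus recency bonus, clamped to 0-20."""
    base = next((pts for minimum, pts in VERSION_BANDS
                 if version_count >= minimum), 0)
    bonus = next((pts for maximum, pts in RECENCY_BANDS
                  if days_since_publish <= maximum), 0)
    return max(0, min(20, base + bonus))


print(release_consistency_points(120, 10))
```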

15 pts

Maintainer Depth

Credential concentration is the single strongest structural risk signal in package ecosystems. A sole maintainer is not a quality judgment — it means one set of credentials controls all publish access. The more maintainers, the more organizational resilience. This dimension measures bus factor and credential distribution.

Maintainers     Points
≥ 5             15
3–4             11
2               7
1               4
0               0

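The Maintainer Depth mapping can be sketched as below. The function name is hypothetical. The sketch assumes that exactly five maintainers fall in the top band, which is consistent with the express worked example ("5+ maintainers" scoring 15/15).

```python
def maintainer_depth_points(maintainer_count: int) -> int:
    """Map the registry maintainer count to Maintainer Depth points (0-15)."""
    if maintainer_count >= 5:
        return 15
    if maintainer_count >= 3:  # assumption: the middle band covers 3 and 4
        return 11
    # Two, one, or no listed maintainers.
    return {2: 7, 1: 4}.get(maintainer_count, 0)


for count in (0, 1, 2, 4, 5):
    print(count, maintainer_depth_points(count))
```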
15 pts

GitHub Backing

When a package links to a GitHub repository, that repository provides independent signals about project health: contributor count, commit activity, release cadence, and community engagement. This dimension maps the linked repository's commitment score (0–100) to a 0–15 range.

github_backing = (github_repo_score / 100) × 15
If no linked repository: 0 points
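The formula above translates directly into code. In this sketch the function name is hypothetical, `None` stands in for "no linked repository", and the result is left unrounded because the specification does not state a rounding rule.

```python
from typing import Optional


def github_backing_points(github_repo_score: Optional[float]) -> float:
    """Scale a 0-100 repository score to the 0-15 GitHub Backing range."""
    if github_repo_score is None:
        return 0.0  # no linked repository
    if not 0 <= github_repo_score <= 100:
        raise ValueError("repository score must be between 0 and 100")
    # Multiply before dividing to keep round inputs exact in floating point.
    return github_repo_score * 15 / 100


print(github_backing_points(80))
```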

Score Composition

Longevity 25 + Momentum 25 + Release 20 + Maintainer 15 + GitHub 15 = 100

03

Risk Flags

Risk flags operate independently of the numerical score. A package can score 89/100 and still carry a CRITICAL flag. This is by design: a high score means the package is well-established and widely adopted. A CRITICAL flag means one credential controls that entire surface area. Both statements are true simultaneously.

CRITICAL Single maintainer AND >10M weekly downloads

Maximum credential concentration at catastrophic blast radius. One compromised account affects the entire downstream dependency tree at infrastructure scale.

Historical validation: axios (WAVESHAPER.V2, March 31, 2026), event-stream (2018), ua-parser-js (2021). Each was a sole-maintainer package with massive download volume. The structural signal was visible for years before the attack.

HIGH Either condition:
  • Package age < 1 year AND >1M weekly downloads
  • Single maintainer AND >1M weekly downloads

Elevated structural risk. Either rapid adoption without time-tested stability, or significant credential concentration at meaningful scale.

WARN No publish in >365 days

The package has not published a new version in over one year. This may indicate abandonment, or may indicate a mature, stable library. Context matters. The flag ensures the condition is visible.
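The flag rules above can be sketched as one function that is evaluated separately from the score. The function name and parameters are hypothetical. The sketch assumes that CRITICAL supersedes HIGH (every CRITICAL package also meets the second HIGH condition), that WARN is reported independently, and that "age < 1 year" means fewer than 365 days.

```python
def risk_flags(maintainers: int, weekly_downloads: int,
               age_days: int, days_since_publish: int) -> list[str]:
    """Return the risk flags that apply, independent of the numerical score."""
    flags = []
    sole_maintainer = maintainers == 1
    if sole_maintainer and weekly_downloads > 10_000_000:
        flags.append("CRITICAL")
    elif weekly_downloads > 1_000_000 and (sole_maintainer or age_days < 365):
        # Either HIGH condition: young package or single maintainer, above 1M.
        flags.append("HIGH")
    if days_since_publish > 365:
        flags.append("WARN")
    return flags


# A long-lived sole-maintainer package with ~100M weekly downloads.
print(risk_flags(maintainers=1, weekly_downloads=100_000_000,
                 age_days=4000, days_since_publish=10))
```

Note that the thresholds are stated in weekly downloads, so a PyPI daily figure would need converting before it is passed in.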


04

Repository Scoring: GitHub

When a package links to a GitHub repository, or when a repository is assessed directly, a separate scoring model applies. The dimensions differ from package scoring because the available signals differ.

30 pts

Longevity

Repository age  Points
≥ 5 years       30
3–5 years       22
1–3 years       14
0.5–1 year      7
< 6 months      2

25 pts

Recent Activity

Commits in the last 30 days. A direct measure of active development.

Commits (30 days)   Points
≥ 50                25
≥ 20                20
≥ 6                 15
≥ 1                 8
0                   0

20 pts

Community

Contributor count. More contributors means distributed knowledge, review capacity, and reduced bus factor.

Contributors    Points
≥ 20            20
≥ 6             15
≥ 2             10
1               5

15 pts

Release Cadence

The number of stable (non-prerelease) releases among the 10 most recent releases. Regular releases indicate active maintenance and a commitment to shipping.

Stable releases Points
≥ 10            15
≥ 3             10
≥ 1             5
0               0

10 pts

Social Proof

Stargazer count. An imperfect but broadly available signal of ecosystem recognition. Weighted lowest because stars are gameable and noisy.

Stars           Points
≥ 10,000        10
≥ 1,000         8
≥ 100           5
≥ 10            2
< 10            0

Critical Penalty

If a repository is archived or has had no push events in 730+ days, the final score is multiplied by 0.5. This reflects terminal maintenance risk.

Repository Score Composition

Longevity 30 + Activity 25 + Community 20 + Releases 15 + Social 10 = 100

05

Score Tiers

Numerical scores are mapped to four tiers for quick interpretation. These tiers are consistent across all ecosystem types (npm, PyPI, GitHub).

80–100
Strong structural trust

Well-established, actively maintained, multiple maintainers or strong organizational backing. Low structural risk.

60–79
Moderate

Adequate maintenance signals but with structural gaps. Review the specific dimension breakdown before depending on this package.

40–59
Weak

Multiple structural concerns. Consider alternatives or evaluate whether the risk is acceptable for your use case.

0–39
Minimal

High structural risk across multiple dimensions. The package may be abandoned, very new, or unmaintained.
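The tier boundaries above map to a short function. The name is hypothetical, and treating fractional scores by their lower bound (so 79.5 is "Moderate") is an assumption, since the tiers are published as integer ranges.

```python
def score_tier(score: float) -> str:
    """Map a 0-100 score to its published tier name."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 80:
        return "Strong structural trust"
    if score >= 60:
        return "Moderate"
    if score >= 40:
        return "Weak"
    return "Minimal"


for value in (89, 75, 59, 12):
    print(value, score_tier(value))
```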

Badge Colors

SVG badges served at /api/badge/:ecosystem/:package are colored by score band or state: CRITICAL, < 40, 40–59, 60–74, 75+, and Not found.

06

Worked Examples

Three packages that illustrate why score and risk flags are independent signals. Data as of April 2026.

axios   Score: ~89   CRITICAL

Dimension     Points    Detail
Longevity     25/25     11+ years old
Momentum      25/25     ~100M weekly downloads, growing
Release       20/20     100+ versions, recently published
Maintainer    4/15      1 maintainer
GitHub        ~15/15    104k stars, active repo

Why CRITICAL despite high score: The score says "well-established, widely adopted, actively maintained." The flag says "one set of credentials controls infrastructure used by millions of projects." On March 31, 2026, those credentials were compromised in the WAVESHAPER.V2 attack. The structural signal was visible for years.

chalk   Score: ~75   CRITICAL

Dimension     Points    Detail
Longevity     25/25     12+ years old
Momentum      25/25     ~411M weekly downloads
Release       ~14/20    Moderate version count, less recent
Maintainer    4/15      1 maintainer
GitHub        ~7/15     21k stars, lower recent activity

The asymmetry: chalk is downloaded 411 million times per week, and one person controls the publish token. This is not a judgment on the maintainer; it is a statement about the ecosystem's exposure. chalk is the highest-download single-maintainer package on npm.

express   Score: ~97   No flags

Dimension     Points    Detail
Longevity     25/25     15+ years old
Momentum      25/25     ~93M weekly downloads, growing
Release       20/20     300+ versions, recently published
Maintainer    15/15     5+ maintainers
GitHub        ~12/15    65k stars, active multi-contributor repo

High score, no flags: express demonstrates what structural trust looks like. It is long-lived and actively maintained by a team, with organizational backing from the OpenJS Foundation. No single credential controls the package.
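As a quick arithmetic check, the dimension points listed in the three examples sum to the stated totals. The dictionary below simply restates those figures; the values marked "~" in the examples are approximate, so the totals are approximate as well.

```python
# Dimension points as listed in the worked examples (approximate where marked "~").
EXAMPLES = {
    "axios":   {"longevity": 25, "momentum": 25, "release": 20, "maintainer": 4,  "github": 15},
    "chalk":   {"longevity": 25, "momentum": 25, "release": 14, "maintainer": 4,  "github": 7},
    "express": {"longevity": 25, "momentum": 25, "release": 20, "maintainer": 15, "github": 12},
}

# The package score is the plain sum of the five dimensions.
totals = {name: sum(parts.values()) for name, parts in EXAMPLES.items()}

for name, total in totals.items():
    print(f"{name}: {total}")
# prints:
# axios: 89
# chalk: 75
# express: 97
```

The axios and chalk totals differ by 14 points, yet both carry the same CRITICAL flag, which illustrates why the flag is reported separately from the score.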

07

Limitations

No scoring model is complete. These are the known limitations of the current methodology. Acknowledging them is a feature, not a caveat.

Weights are informed heuristics, not empirically derived

The dimension weights (25/25/20/15/15) reflect expert judgment about which structural signals correlate most with supply chain risk. They are not the output of a regression model trained on historical attack data. Such a model would require a comprehensive labeled dataset of supply chain compromises that does not exist at sufficient scale. The weights are defensible, not proven.

Code quality is not measured

Commit does not assess test coverage, code complexity, documentation quality, or architectural soundness. A well-structured package with zero tests and a chaotic package with comprehensive coverage receive the same score if their structural signals are identical.

Maintainer identity is not verified

Maintainer count is taken from registry metadata. Whether those accounts represent distinct individuals, or whether they have appropriate access controls (2FA, hardware keys), is not checked. A package with 5 maintainers where all 5 share one person's email alias would score the same as one with 5 independent maintainers.

Thresholds are step functions

A package at 999,999 weekly downloads scores 18 on momentum; at 1,000,000 it scores 22. This discrete jump does not reflect a real-world discontinuity. Continuous scoring functions would be smoother but harder to audit and reproduce. We chose transparency over precision.

No security practice assessment

Whether a package uses signed releases, requires 2FA for publishers, or has a security policy is not measured. These are important signals, but they are declarations (gameable), not behaviors (observable). Future versions may incorporate signals that bridge this gap.

Ecosystem coverage

v1.0.0 covers npm, PyPI, and GitHub repositories. Cargo, Go modules, Maven, and other ecosystems are not yet supported. Each ecosystem requires calibrated download thresholds and registry-specific metadata handling.


08

Versioning & Changelog

This specification follows Semantic Versioning.

Major   Breaking changes: weight redistribution, dimension removal or addition, flag threshold changes
Minor   New ecosystem support, additional flag types, new dimensions (additive only)
Patch   Clarifications, worked examples, editorial corrections

Changelog

v1.0.0 2026-04-19 Initial specification. npm, PyPI, and GitHub scoring. Risk flag definitions.
