LMDeploy CVE-2026-33626 Flaw Exploited Within 13 Hours of Disclosure

A high-severity SSRF vulnerability in LMDeploy, a widely used open-source LLM deployment toolkit, was actively exploited in the wild less than 13 hours after its public disclosure, underscoring how sharply the window between patch release and weaponization has narrowed.

Dylan H.

News Desk

April 26, 2026
6 min read

A high-severity security flaw in LMDeploy, an open-source toolkit used to compress, deploy, and serve large language models, came under active exploitation in the wild less than 13 hours after its public disclosure. The vulnerability, tracked as CVE-2026-33626 with a CVSS score of 7.5, is a Server-Side Request Forgery (SSRF) flaw that enables attackers to make the server issue arbitrary network requests on their behalf.

The speed of exploitation — from disclosure to weaponization in under half a day — reflects a trend security researchers have been warning about throughout 2026: the window between vulnerability disclosure and active attacks has collapsed for publicly available open-source AI infrastructure tools.

What Is LMDeploy?

LMDeploy is an open-source project maintained by Shanghai AI Laboratory (InternLM) and used extensively to deploy LLM inference services. It provides:

  • Model compression and quantization tools (AWQ, SmoothQuant)
  • High-throughput inference serving via the TurboMind engine
  • Support for popular model families including LLaMA, Mistral, Qwen, and InternLM
  • REST API server compatible with OpenAI's chat completions format

LMDeploy is widely used in both research environments and production AI deployments, making it an attractive target for attackers seeking access to AI infrastructure or the sensitive data it processes.

The Vulnerability: CVE-2026-33626

CVE ID: CVE-2026-33626
CVSS Score: 7.5 (High)
Vulnerability Type: Server-Side Request Forgery (SSRF)
Attack Vector: Network
Authentication Required: None
Exploitation Status: Actively exploited in the wild
Disclosure Date: April 24, 2026
Time to Exploitation: Under 13 hours

The SSRF flaw allows an unauthenticated remote attacker to manipulate LMDeploy's API endpoints into making HTTP requests to arbitrary network destinations — including internal services, cloud metadata endpoints, and other infrastructure components not intended to be externally accessible.

How SSRF Works in This Context

Server-Side Request Forgery exploits occur when an application fetches a remote resource based on user-supplied input without properly validating or restricting the destination. In an LLM deployment context, this is particularly dangerous because:

  1. Cloud metadata APIs — Attackers can use the SSRF to query http://169.254.169.254/latest/meta-data/ (AWS instance metadata) or equivalent endpoints on GCP and Azure, potentially retrieving instance credentials and IAM tokens
  2. Internal network pivoting — LMDeploy servers often sit on internal networks with access to databases, model registries, and other services; SSRF enables access to these without direct network exposure
  3. Credential exfiltration — Cloud credentials obtained via metadata SSRF can be used to escalate into the broader cloud environment
  4. Model data access — Internal storage systems containing proprietary model weights or training data may be reachable

A typical exploitation chain looks like:

1. Attacker sends crafted API request to exposed LMDeploy endpoint
2. Server fetches attacker-specified URL (e.g., http://169.254.169.254/...)
3. Cloud metadata credentials returned to attacker in response
4. Attacker uses obtained credentials for broader cloud environment access
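The first two steps of the chain above can be sketched as a single crafted request. Note that the endpoint path (`/v1/fetch`) and parameter name (`resource_url`) below are hypothetical stand-ins, since the advisory does not publish the exact vulnerable field; the point is that a URL-typed field in the request body reaches the server's HTTP client unvalidated:

```python
import json

# Hypothetical sketch of the crafted request in the chain above.
# The endpoint path and the "resource_url" field are illustrative
# assumptions, not the actual vulnerable parameter of CVE-2026-33626.
AWS_METADATA = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"

def build_ssrf_probe(target_host: str, target_port: int = 23333) -> dict:
    """Assemble (but do not send) a request whose attacker-controlled
    URL the vulnerable server would fetch on the attacker's behalf."""
    return {
        "method": "POST",
        "url": f"http://{target_host}:{target_port}/v1/fetch",  # hypothetical endpoint
        "body": {"resource_url": AWS_METADATA},  # server-side fetch target
    }

print(json.dumps(build_ssrf_probe("lmdeploy.internal.example"), indent=2))
```

If the server fetches the supplied URL and echoes the response back, the attacker receives the instance's IAM credentials in step 3 of the chain.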

Exploitation in Under 13 Hours

According to threat intelligence reports, the first exploitation attempts against CVE-2026-33626 were observed less than 13 hours after The Hacker News published the vulnerability details on April 24, 2026. This timeline matches a pattern seen with other high-profile AI infrastructure vulnerabilities in 2026:

  • Langflow CVE-2026-33017 — exploited within 20 hours of disclosure
  • SGLang CVE-2026-5760 — weaponized within 48 hours
  • LMDeploy CVE-2026-33626 — exploited within 13 hours

The rapid exploitation is attributed to automated scanning infrastructure maintained by threat actors that continuously monitors vulnerability disclosures and deploys exploitation code within hours of proof-of-concept publication.

Exposure Surface

LMDeploy deployments frequently expose their inference APIs on accessible ports (default: 23333) to serve model requests from applications and users. Misconfigured or development deployments may expose these endpoints to the internet without authentication, dramatically increasing attack surface.

Shodan and Censys scans have historically identified thousands of openly accessible LLM inference endpoints — including LMDeploy, Ollama, and similar tools — with no authentication controls.

Impact on AI Infrastructure

The consequences of successful SSRF exploitation against an LMDeploy server extend well beyond the immediate server:

  • Cloud credential theft: IMDSv1 metadata API yields AWS/GCP/Azure credentials without authentication
  • Model IP theft: Access to internal model registries or storage buckets containing proprietary weights
  • Training data exposure: Datasets stored in accessible internal services
  • Lateral movement: Internal network access enables attacks on databases, registries, and other services
  • Supply chain risk: A compromised inference API can serve modified outputs to downstream applications
  • Sensitive data leakage: Inference logs and prompt histories may contain sensitive user queries

Mitigation and Remediation

Immediate Actions

1. Update LMDeploy immediately. Apply the patched version as soon as it is available. Check the official repository for the patched release addressing CVE-2026-33626.

2. Restrict network exposure. LMDeploy's inference API should never be directly exposed to the internet without an authentication proxy:

# Bind LMDeploy to localhost only — proxy through nginx with auth
lmdeploy serve api_server ./model --server-port 23333 --server-name 127.0.0.1

3. Block SSRF paths at the network layer. Implement egress filtering that prevents the LMDeploy server process from reaching internal metadata endpoints:

# Block AWS IMDSv1 metadata access from application servers
iptables -I OUTPUT -d 169.254.169.254 -j REJECT
 
# For cloud deployments, enforce IMDSv2 (token-required) and disable IMDSv1
aws ec2 modify-instance-metadata-options \
  --instance-id i-xxxx \
  --http-tokens required \
  --http-endpoint enabled

4. Implement authentication. Place an authenticated reverse proxy in front of any LMDeploy API endpoint:

location /v1/ {
    auth_basic "LMDeploy API";
    auth_basic_user_file /etc/nginx/.htpasswd;
    proxy_pass http://127.0.0.1:23333;
}

5. Audit for compromise indicators. Review server access logs for suspicious outbound requests or unusual API call patterns that could indicate SSRF exploitation.
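As a starting point for that audit, access and egress logs can be scanned for metadata-service indicators. A minimal sketch follows; the log format and the specific patterns are assumptions to adapt to your proxy's log layout, and determined attackers use further encodings beyond these:

```python
import re

# Indicators of SSRF-style activity worth flagging in access/egress logs:
# link-local metadata IPs, metadata hostnames, and a common bypass encoding.
SSRF_PATTERNS = [
    re.compile(r"169\.254\.169\.254"),            # AWS/Azure/GCP metadata IP
    re.compile(r"metadata\.google\.internal"),    # GCP metadata hostname
    re.compile(r"\[fd00:ec2::254\]"),             # AWS IPv6 metadata endpoint
    re.compile(r"a9fea9fe", re.IGNORECASE),       # hex-encoded 169.254.169.254
]

def flag_suspicious_lines(log_lines):
    """Return (line_number, line) pairs matching any SSRF indicator."""
    hits = []
    for i, line in enumerate(log_lines, start=1):
        if any(p.search(line) for p in SSRF_PATTERNS):
            hits.append((i, line.rstrip()))
    return hits

sample = [
    '10.0.0.5 - - "POST /v1/chat/completions HTTP/1.1" 200',
    '10.0.0.9 - - "POST /v1/fetch url=http://169.254.169.254/latest/meta-data/ HTTP/1.1" 200',
]
for lineno, line in flag_suspicious_lines(sample):
    print(f"line {lineno}: {line}")
```

Any hit warrants treating the host's cloud credentials as compromised and rotating them.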

Longer-Term Hardening

  • Deploy LMDeploy in an isolated network segment with no route to internal infrastructure
  • Use container runtime security tools (Falco, Tracee) to monitor for unexpected outbound connections
  • Rotate all cloud credentials and API keys accessible from the affected server
  • Enable detailed request logging to detect SSRF-pattern payloads
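Application-level egress filtering complements the network-layer iptables rule shown earlier. The sketch below validates a fetch destination before the server's HTTP client ever connects; it handles only IP literals to stay self-contained, whereas a production guard must also resolve hostnames, re-check after every redirect, and defend against DNS rebinding:

```python
import ipaddress
from urllib.parse import urlparse

def is_safe_destination(url: str) -> bool:
    """Reject URLs whose host is a loopback, private, link-local, or
    otherwise non-public address (which covers 169.254.169.254)."""
    host = urlparse(url).hostname
    if not host:
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Hostname, not an IP literal: resolve and re-check in production.
        # Rejecting here keeps this self-contained sketch fail-closed.
        return False
    return ip.is_global and not ip.is_multicast

print(is_safe_destination("http://169.254.169.254/latest/meta-data/"))  # False
print(is_safe_destination("http://93.184.216.34/"))                     # True (public IP)
```

Failing closed on anything that is not verifiably a public address is the safer default for an inference server that has no legitimate reason to fetch internal resources.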

The Accelerating Exploitation Window

CVE-2026-33626 is the latest example of what security researchers are calling the "collapsing N-day window" — the time between vulnerability disclosure and active exploitation in the wild. For AI infrastructure tools with large deployment bases, this window has fallen below 24 hours in multiple documented cases in 2026.

The implication for defenders is stark: patch management windows measured in days or weeks are no longer sufficient for critical AI infrastructure vulnerabilities. Organizations running LLM serving infrastructure must implement automated vulnerability scanning that flags newly disclosed CVEs within hours, and maintain pre-validated emergency patching procedures for AI tooling.

Sources

  • The Hacker News — LMDeploy CVE-2026-33626 Flaw Exploited Within 13 Hours of Disclosure
  • NVD — CVE-2026-33626
