We Scanned 1 Million Exposed AI Services. Here's How Bad the Security Actually Is

The breakneck pace of AI adoption is creating a new and sprawling attack surface — and a landmark security audit of over one million exposed AI service endpoints reveals the security posture is, by many measures, worse than the early days of cloud computing.

Researchers scanning the internet for publicly accessible self-hosted LLM infrastructure found a pattern of absent authentication, exposed API keys, open model management interfaces, and unprotected inference endpoints across a staggering number of deployments. The findings, published this week in The Hacker News, argue that the software industry's hard-won progress on secure defaults is being eroded by the rush to ship AI capabilities.

What the Scan Found

The research team identified over one million publicly reachable AI service endpoints, including inference APIs, model management dashboards, vector databases, and AI orchestration platforms. Key findings from the audit:

Issue	Scope
Unauthenticated inference APIs	Tens of thousands of open Ollama, LM Studio, and similar endpoints accepting requests from any IP
Exposed model management UIs	Admin dashboards (Hugging Face local, OpenWebUI, Flowise) with no login required
Leaked API keys in responses	Misconfigured reverse proxies forwarding upstream provider keys in response headers
Open vector database instances	Qdrant, Weaviate, and Chroma instances reachable without credentials, containing embeddings of sensitive documents
Unprotected LangChain/LlamaIndex APIs	Orchestration endpoints accepting arbitrary tool calls with no access control

The researchers noted that many of these services were deployed by individual developers or small teams under deadline pressure — the common thread being that security was deprioritized in favor of getting the AI feature working.

Why This Matters Beyond Chatbots

The risk isn't just that someone can make free API calls to a company's self-hosted LLM. The deeper concerns are:

Data Exfiltration via Open Inference Endpoints

Many self-hosted AI deployments are purpose-built for internal document analysis — legal contracts, medical records, financial reports. An open inference endpoint means any attacker who can reach it can query the model with prompts designed to extract training data or retrieve documents the model was fine-tuned on.

Prompt Injection at Scale

Open orchestration APIs (LangChain agents, AutoGPT-style systems, n8n AI nodes) are often wired to internal tools — databases, filesystems, email, Slack. An attacker with access to the inference API can craft prompts that cause the AI agent to exfiltrate data, send messages, or invoke destructive tool calls.

Credential Theft via Header Leakage

Multiple instances were found where misconfigured Nginx or Caddy reverse proxies forwarded the Authorization: Bearer sk-... header from upstream AI providers — effectively handing attacker the organization's OpenAI, Anthropic, or Mistral API keys in the HTTP response.

Supply Chain Risk via Poisoned Models

Open model management interfaces allow arbitrary model uploads. An attacker who can reach an admin dashboard can replace a production model with a malicious one that exfiltrates prompts, injects content into responses, or behaves differently for specific users.

The Root Cause: Secure Defaults Are Broken

The researchers point to a systemic failure in how AI tooling is packaged and documented:

Ollama binds to 0.0.0.0 by default on some platforms, making it publicly accessible if the host has a public IP and no firewall rule
OpenWebUI does not enforce authentication on the API path even when the web UI login is enabled
Many vector databases ship with authentication disabled by default, relying on network-level isolation that developers often do not configure
AI development tutorials routinely skip security setup steps to reduce friction, normalizing insecure configurations

This mirrors the early days of MongoDB and Elasticsearch, where thousands of unprotected databases were exposed to the internet because the default configuration assumed a trusted network — an assumption that rarely held in practice.

How Attackers Are Exploiting This

The research noted evidence of active exploitation, including:

Cryptomining campaigns using free inference capacity on open Ollama endpoints to run GPU-accelerated workloads
Data scraping operations extracting business-sensitive context from RAG-enabled AI systems
Reconnaissance tooling that identifies and catalogs open AI endpoints for later targeting

Attack tooling for enumerating and exploiting open AI services is now circulating on underground forums, lowering the barrier for less sophisticated actors.

Recommendations for Organizations Deploying Self-Hosted AI

Authentication Is Non-Negotiable

Every AI inference endpoint, model management API, and vector database must require authentication — regardless of whether it is "internal only." Network perimeter assumptions are unreliable.

# Nginx basic auth for Ollama
location /api/ {
    auth_basic "AI Services";
    auth_basic_user_file /etc/nginx/.htpasswd;
    proxy_pass http://localhost:11434;
}

Bind to Localhost, Not 0.0.0.0

Self-hosted AI services should bind to 127.0.0.1 by default, with a reverse proxy handling external access. Review OLLAMA_HOST, CHROMA_HOST, and similar environment variables in your deployment.

Audit Your Exposure

Use tools like Shodan, Censys, or your cloud provider's internet-exposure scanner to identify any AI service ports reachable from the internet:

# Check what's listening on common AI ports
ss -tlnp | grep -E "11434|8080|6333|8000|3000|7860"

Rotate Any Exposed Keys

If your reverse proxy or API gateway forwards upstream AI provider credentials, audit your configuration and rotate any keys that may have been exposed.

Treat AI Infrastructure as Production Systems

Developer laptops with Ollama running are not production systems. AI inference infrastructure that touches business data must be subject to the same security controls as any other production service: authentication, network segmentation, logging, and patch management.

References

The Hacker News — We Scanned 1 Million Exposed AI Services

What the Scan Found

Issue	Scope
Unauthenticated inference APIs	Tens of thousands of open Ollama, LM Studio, and similar endpoints accepting requests from any IP
Exposed model management UIs	Admin dashboards (Hugging Face local, OpenWebUI, Flowise) with no login required
Leaked API keys in responses	Misconfigured reverse proxies forwarding upstream provider keys in response headers
Open vector database instances	Qdrant, Weaviate, and Chroma instances reachable without credentials, containing embeddings of sensitive documents
Unprotected LangChain/LlamaIndex APIs	Orchestration endpoints accepting arbitrary tool calls with no access control

Why This Matters Beyond Chatbots

The risk isn't just that someone can make free API calls to a company's self-hosted LLM. The deeper concerns are:

Data Exfiltration via Open Inference Endpoints

Prompt Injection at Scale

Credential Theft via Header Leakage

Supply Chain Risk via Poisoned Models

The Root Cause: Secure Defaults Are Broken

The researchers point to a systemic failure in how AI tooling is packaged and documented:

Ollama binds to 0.0.0.0 by default on some platforms, making it publicly accessible if the host has a public IP and no firewall rule
OpenWebUI does not enforce authentication on the API path even when the web UI login is enabled
Many vector databases ship with authentication disabled by default, relying on network-level isolation that developers often do not configure
AI development tutorials routinely skip security setup steps to reduce friction, normalizing insecure configurations

How Attackers Are Exploiting This

The research noted evidence of active exploitation, including:

Cryptomining campaigns using free inference capacity on open Ollama endpoints to run GPU-accelerated workloads
Data scraping operations extracting business-sensitive context from RAG-enabled AI systems
Reconnaissance tooling that identifies and catalogs open AI endpoints for later targeting

Attack tooling for enumerating and exploiting open AI services is now circulating on underground forums, lowering the barrier for less sophisticated actors.

Recommendations for Organizations Deploying Self-Hosted AI

Authentication Is Non-Negotiable

Every AI inference endpoint, model management API, and vector database must require authentication — regardless of whether it is "internal only." Network perimeter assumptions are unreliable.

# Nginx basic auth for Ollama
location /api/ {
    auth_basic "AI Services";
    auth_basic_user_file /etc/nginx/.htpasswd;
    proxy_pass http://localhost:11434;
}

Bind to Localhost, Not 0.0.0.0

Audit Your Exposure

Use tools like Shodan, Censys, or your cloud provider's internet-exposure scanner to identify any AI service ports reachable from the internet:

# Check what's listening on common AI ports
ss -tlnp | grep -E "11434|8080|6333|8000|3000|7860"

Rotate Any Exposed Keys

If your reverse proxy or API gateway forwards upstream AI provider credentials, audit your configuration and rotate any keys that may have been exposed.

Treat AI Infrastructure as Production Systems

References

The Hacker News — We Scanned 1 Million Exposed AI Services

We Scanned 1 Million Exposed AI Services. Here's How Bad the Security Actually Is

What the Scan Found

Why This Matters Beyond Chatbots

Data Exfiltration via Open Inference Endpoints

Prompt Injection at Scale

Credential Theft via Header Leakage

Supply Chain Risk via Poisoned Models

The Root Cause: Secure Defaults Are Broken

How Attackers Are Exploiting This

Recommendations for Organizations Deploying Self-Hosted AI

Authentication Is Non-Negotiable

Bind to Localhost, Not 0.0.0.0

Audit Your Exposure

Rotate Any Exposed Keys

Treat AI Infrastructure as Production Systems

References

2026: The Year AI Became the Attacker's Favorite Co-Pilot

Weekly Recap: AI-Powered Phishing, Android Spying Tool, Linux Exploit, GitHub RCE

Why Data Centers Now Belong on the Critical Infrastructure List

We Scanned 1 Million Exposed AI Services. Here's How Bad the Security Actually Is

What the Scan Found

Why This Matters Beyond Chatbots

Data Exfiltration via Open Inference Endpoints

Prompt Injection at Scale

Credential Theft via Header Leakage

Supply Chain Risk via Poisoned Models

The Root Cause: Secure Defaults Are Broken

How Attackers Are Exploiting This

Recommendations for Organizations Deploying Self-Hosted AI

Authentication Is Non-Negotiable

Bind to Localhost, Not 0.0.0.0

Audit Your Exposure

Rotate Any Exposed Keys

Treat AI Infrastructure as Production Systems

References

2026: The Year AI Became the Attacker's Favorite Co-Pilot

Weekly Recap: AI-Powered Phishing, Android Spying Tool, Linux Exploit, GitHub RCE

Why Data Centers Now Belong on the Critical Infrastructure List