CosmicBytez Labs
CVE-2026-7482: Critical Heap Out-of-Bounds Read in Ollama GGUF Model Loader

Critical Security Alert

This vulnerability is actively being exploited. Immediate action is recommended.


A critical heap out-of-bounds read vulnerability (CVSS 9.1) in the GGUF model loader of Ollama before 0.17.1 allows an attacker to supply a crafted model file with mismatched tensor offsets via the /api/create endpoint, triggering memory corruption that can lead to information disclosure or denial of service.

Dylan H.

Security Team

May 5, 2026
7 min read


Executive Summary

A critical heap out-of-bounds read vulnerability (CVE-2026-7482) has been disclosed in Ollama, the widely used open-source platform for running large language models (LLMs) locally. The vulnerability carries a CVSS score of 9.1 (Critical) and affects Ollama versions before 0.17.1.

  • CVSS Score: 9.1 (Critical)
  • Attack Vector: Network
  • Authentication Required: Dependent on deployment configuration

The flaw resides in Ollama's GGUF model file loader (fs/ggml/gguf.go) and is triggered via the /api/create endpoint. An attacker can supply a specially crafted GGUF file in which the declared tensor offset and size exceed the file's actual length. During quantization processing, the loader reads beyond the allocated heap buffer, enabling potential information disclosure (leaking heap memory contents), process crash (denial of service), or under specific conditions, a path toward code execution depending on heap layout.


Vulnerability Overview

| Attribute | Value |
|---|---|
| CVE ID | CVE-2026-7482 |
| CVSS Score | 9.1 (Critical) |
| CWE | CWE-125 — Out-of-Bounds Read |
| Type | Heap Out-of-Bounds Read |
| Attack Vector | Network |
| Attack Complexity | Low |
| Privileges Required | None (on default/unauthenticated deployments) |
| User Interaction | None |
| Affected Component | GGUF model loader — fs/ggml/gguf.go, server/quantization |
| Trigger Endpoint | /api/create |
| Fixed Version | Ollama 0.17.1 |
| Published | 2026-05-04 |

Affected Products

| Product | Affected Versions | Fixed Version |
|---|---|---|
| Ollama | All versions before 0.17.1 | 0.17.1 |

Ollama is one of the most widely adopted platforms for running models such as Llama 3, Mistral, Phi, Gemma, and other GGUF-format models locally. It is deployed across developer workstations, internal inference servers, AI application backends, and cloud-hosted instances — including many with the API exposed on public or semi-public network interfaces.


Technical Details

What Is GGUF?

GGUF (GPT-Generated Unified Format) is the binary file format used by llama.cpp and the broader LLM ecosystem to distribute quantized model weights. A GGUF file contains a header with metadata followed by tensor data blocks. Each tensor is described by a name, type, dimensions, an offset into the file, and a data size.

The Vulnerability

The Ollama GGUF loader in fs/ggml/gguf.go reads tensor metadata (offset + size) from the GGUF file header and uses those values to locate tensor data within a memory-mapped file buffer. The loader does not validate that tensor_offset + tensor_size <= file_size before performing the read operation.

When a malicious GGUF file is crafted with a tensor offset or size that exceeds the file's actual length:

Malicious GGUF tensor descriptor:
  tensor_name:   "malicious_tensor"
  tensor_offset: 0x1000          # Valid offset within file
  tensor_size:   0x99999999      # FAR exceeds actual file size
 
During quantization in server/quantization:
  src_ptr = mmap_base + tensor_offset          # Points within file mapping
  memcpy(dst, src_ptr, tensor_size)            # Reads PAST end of file mapping
                                                # → Heap out-of-bounds read

The read operation goes beyond the mapped memory region, potentially reading adjacent heap allocations. This can:

  1. Leak heap memory contents — including other model data, API keys stored in memory, or internal process state
  2. Crash the Ollama process — if the read reaches unmapped memory (SIGSEGV)
  3. Enable information disclosure if the API returns processed quantization output that includes leaked heap bytes

Deployment Risk Amplifier

Many Ollama deployments expose the /api/create endpoint on 0.0.0.0:11434 with no authentication by default. This is intentional for local developer use but becomes a critical exposure when Ollama is deployed on cloud VMs, Docker containers with exposed ports, or internal servers accessible by untrusted parties. A single crafted HTTP POST with a malicious GGUF file is sufficient to trigger the vulnerability with no prior authentication.

# Example trigger (proof-of-concept, not weaponized)
curl -X POST http://[ollama-server]:11434/api/create \
  -H "Content-Type: application/json" \
  -d '{"name": "exploit", "modelfile": "FROM /path/to/malicious.gguf"}'

Impact Assessment

| Impact Area | Description |
|---|---|
| Information Disclosure | Heap memory contents leaked — may include API keys, model metadata, or runtime state |
| Denial of Service | Ollama process crash from SIGSEGV on unmapped read — disrupts AI inference services |
| Memory Corruption Path | Depending on heap layout, heap OOB reads can escalate to write conditions in some exploit chains |
| Model Poisoning Vector | An attacker who can load arbitrary GGUF files can potentially influence model behavior |
| Backend API Exposure | If Ollama is a backend for an AI application, disrupting it can cascade to dependent services |

Recommendations

Immediate Actions

  1. Update Ollama to version 0.17.1 or later — this is the only complete fix:
    # Linux/macOS update
    curl -fsSL https://ollama.com/install.sh | sh
     
    # Or via package manager if installed that way
    brew upgrade ollama   # macOS with Homebrew
  2. Verify the running version:
    ollama --version
    # Should report 0.17.1 or higher

Defense-in-Depth (Even After Patching)

  1. Restrict /api/create access — if model creation from external inputs is not required, block this endpoint at the network layer or via a reverse proxy:
    location /api/create {
        allow 127.0.0.1;
        deny all;
    }
  2. Restrict cross-origin and remote access — set OLLAMA_ORIGINS to trusted origins only, and place an authenticating reverse proxy in front of any non-localhost access (Ollama itself does not ship with built-in API key authentication)
  3. Isolate Ollama from the internet — ensure the Ollama port (default 11434) is not exposed on WAN interfaces:
    # Bind Ollama to localhost only
    OLLAMA_HOST=127.0.0.1 ollama serve
  4. Validate GGUF files before loading — implement a model integrity verification step before passing files to Ollama's /api/create

For AI Application Developers

If you are building applications on top of Ollama:

  • Never expose the Ollama API directly to end users — proxy requests through your application layer
  • Validate and sanitize any user-supplied model paths or GGUF file inputs before forwarding to Ollama
  • Implement rate limiting on model creation endpoints

Detection Indicators

| Indicator | Description |
|---|---|
| Ollama process crash (SIGSEGV) | Failed exploitation attempt or denial of service |
| Unexpected /api/create calls from external IPs | Exploitation attempts |
| Abnormally large GGUF files in model creation requests | Potential malicious payload delivery |
| Unusual memory usage spikes on Ollama process | Possible exploitation in progress |
| AI inference service unavailability | Successful DoS via crash |

Broader Context: AI Infrastructure Security

CVE-2026-7482 is part of a growing pattern of vulnerabilities in AI model serving infrastructure. As organizations deploy local and cloud-hosted LLM servers, the attack surface expands significantly. Related recent vulnerabilities include:

  • CVE-2026-5760 (SGLang CVSS 9.8) — RCE via malicious GGUF model files
  • LMDeploy CVE-2026-33626 — Exploited within 13 hours of disclosure
  • Gemini CLI RCE — Code execution via compromised model interactions

The pattern is clear: AI model files are an emerging attack vector. Treat GGUF files from untrusted sources with the same caution as executable code.


Post-Remediation Checklist

  1. Confirm Ollama 0.17.1+ is running on all deployment nodes
  2. Audit exposed API endpoints — verify /api/create is not internet-accessible
  3. Review model creation logs for suspicious activity prior to patching
  4. Rotate any credentials or API keys that were in memory during the exposure window
  5. Implement GGUF file validation in your model deployment pipeline going forward
  6. Monitor Ollama process health and set up alerting on unexpected crashes

References

  • NIST NVD — CVE-2026-7482
  • Ollama GitHub Repository
  • Ollama Security Advisories
  • CWE-125: Out-of-Bounds Read
  • GGUF File Format Specification
Tags: AI Security, CVE-2026-7482, Ollama, GGUF, Heap Overflow, LLM Infrastructure, Supply Chain
