Is Ollama secure by default?

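No, not once it is reachable over the network. Ollama ships with no authentication. A stock install listens on 127.0.0.1:11434, but the official Docker image and any setup with OLLAMA_HOST=0.0.0.0 listen on all interfaces, which exposes your LLM to every host that can reach the port. Bind to localhost and put an authenticating reverse proxy in front before any production use.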

How do I add authentication to a self-hosted LLM?

Use a reverse proxy (nginx or Caddy) in front of your LLM gateway. Add API key validation or mTLS client certificate authentication. LiteLLM Proxy provides a production-ready solution with built-in auth, rate limiting, and spend tracking.

What is LiteLLM and why use it for LLM gateway security?

LiteLLM Proxy is an open-source, unified LLM gateway that adds authentication, rate limiting, budget controls, and audit logging in front of any LLM provider. For Moltbot deployments it is a quick route to a hardened gateway, because these controls are configured rather than built by hand.

"Not a Pentest" Notice: This guide is for hardening your own LLM infrastructure. It is not an attack tool.

Moltbot AI Security · Self-Hosted LLM Guide

LLM Gateway Hardening: Your Ollama Port May Be Open Right Now. Here Is the Fix.

Ollama has no built-in authentication. As soon as port 11434 is exposed on all interfaces (OLLAMA_HOST=0.0.0.0, or a published Docker port), every device that can reach the host can use your LLM gateway. This guide closes each gap.

What is an LLM gateway and why is it critical? Explained simply

Imagine you host an AI model locally on your server. The LLM gateway is the door to that model. Without a lock (authentication), anyone on the network can walk in: colleagues, visitors on the office Wi-Fi, or attackers. Ollama, LocalAI, and LiteLLM are popular self-hosted solutions that do not enforce authentication by default. An open LLM gateway costs you GPU resources, can leak your prompts, and enables model extraction.


🚨 Default State Is Dangerous

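Ollama has no built-in authentication. When ollama serve listens on 0.0.0.0:11434 (OLLAMA_HOST=0.0.0.0, as set by the official Docker image and many tutorials; a stock install binds 127.0.0.1), anyone who can reach your machine can: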

• Query your models for free (at your GPU cost)
• Extract model behavior systematically
• Inject malicious prompts into your pipeline
• Read your chat history if not isolated per-user
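
To see whether this applies to your own host, a small probe can send one unauthenticated request from another machine on the same network. The sketch below assumes Node.js 18+ (global fetch) and uses Ollama's model-listing endpoint, GET /api/tags; the function names classifyProbe and probe are illustrative and not part of any tool named in this guide.

```javascript
// Probe your OWN gateway: does it answer without credentials?
// Assumption: the host serves Ollama's model-listing endpoint, GET /api/tags.

// Interpret the HTTP status of a request that carried no credentials.
function classifyProbe(status) {
  if (status === 200) return 'OPEN: answered without credentials'
  if (status === 401 || status === 403) return 'PROTECTED: credentials required'
  return `UNKNOWN: unexpected status ${status}`
}

// Send one unauthenticated request. A timeout or refused connection
// is reported as unreachable, which is the desired state from outside.
async function probe(baseUrl, timeoutMs = 3000) {
  try {
    const res = await fetch(new URL('/api/tags', baseUrl), {
      signal: AbortSignal.timeout(timeoutMs),
    })
    return classifyProbe(res.status)
  } catch (err) {
    return `UNREACHABLE: ${err.name}`
  }
}
```

Run it from a second machine, for example probe('http://192.168.1.50:11434').then(console.log), where the address is a placeholder for your server. An OPEN result means the hardening steps below have not been applied yet.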

Risk Assessment: Default LLM Gateway

CRITICAL

Unauthenticated API

Ollama (port 11434) and LocalAI (port 8080) ship with no authentication. Once the port is reachable, anyone on the network can query your models.

HIGH

No Rate Limiting

Unlimited requests drain GPU resources, run up cloud bills, and enable model extraction attacks.

HIGH

Plain HTTP

LLM traffic, including prompt contents, is transmitted in plaintext and can be intercepted on the local network or by a co-tenant in the cloud.

HIGH

No Audit Logging

Zero visibility into who queried what, when, with what prompts. Forensically blind.

MEDIUM

Wide Network Access

LLM gateway accessible from all subnets instead of only application services that need it.

Hardening Steps

1. Bind to localhost, not 0.0.0.0

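# Ollama (127.0.0.1 is the default; set it explicitly so an inherited OLLAMA_HOST cannot widen it)
OLLAMA_HOST=127.0.0.1:11434 ollama serve

# LocalAI (bind address is a CLI flag; it can also be set via the LOCALAI_ADDRESS env var)
local-ai run --address 127.0.0.1:8080  # NOT 0.0.0.0

# Verify: should ONLY show 127.0.0.1
ss -tlnp | grep -E '11434|8080'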
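
The manual check above can also be automated. The following is a minimal sketch that parses the text output of ss -tln and reports any listener on the given port that is not bound to loopback. The function name exposedListeners is hypothetical, and the parser assumes the usual column layout in which the local address is the fourth column.

```javascript
// Flag listeners on `port` that are bound to anything other than loopback.
// Input: text output of `ss -tln` (local address is the 4th column).
function exposedListeners(ssOutput, port) {
  // 127.0.0.0/8 or [::1], optionally with an interface suffix such as %lo
  const loopback = /^(127\.\d+\.\d+\.\d+|\[::1\])(%\S+)?$/
  return ssOutput
    .split('\n')
    .map((line) => line.trim().split(/\s+/))
    .filter((cols) => cols[0] === 'LISTEN' && cols.length >= 4)
    .map((cols) => cols[3]) // e.g. "127.0.0.1:11434" or "[::]:11434"
    .filter((addr) => addr.endsWith(`:${port}`))
    .filter((addr) => !loopback.test(addr.slice(0, addr.lastIndexOf(':'))))
}
```

One way to use it is to pass in the result of require('node:child_process').execSync('ss -tln', { encoding: 'utf8' }) and fail a deployment check when the returned array is not empty.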

2. Reverse proxy with authentication

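# nginx config for LLM gateway
server {
  listen 443 ssl;
  server_name llm.internal.example.com;

  # Server certificate, required for "listen ... ssl" (paths are examples)
  ssl_certificate     /etc/nginx/certs/llm.internal.example.com.crt;
  ssl_certificate_key /etc/nginx/certs/llm.internal.example.com.key;

  # Option A: mTLS, only clients holding a certificate from the internal CA
  ssl_client_certificate /etc/nginx/certs/internal-ca.crt;
  ssl_verify_client on;

  # Option B: API key auth via subrequest.
  # With both options enabled, a client needs a valid certificate AND a valid key;
  # remove the one you do not use.
  location / {
    auth_request /validate-key;
    proxy_pass http://127.0.0.1:11434;
  }

  location = /validate-key {
    internal;
    proxy_pass http://127.0.0.1:8081/validate;
    # The validator only needs the request headers, not the body
    proxy_pass_request_body off;
    proxy_set_header Content-Length "";
  }
}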
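
The config above forwards each request to http://127.0.0.1:8081/validate but does not show that service. A minimal sketch of such a validator follows, assuming clients send the key in an X-API-Key header. The names makeKeyChecker and makeValidatorServer are hypothetical, and a real deployment would load keys from a secret store rather than an in-memory array.

```javascript
const crypto = require('node:crypto')
const http = require('node:http')

// Hash both sides so timingSafeEqual always compares equal-length buffers.
const digest = (s) => crypto.createHash('sha256').update(String(s)).digest()

// validKeys: array of accepted API keys (illustrative; use a secret store in practice).
function makeKeyChecker(validKeys) {
  const hashes = validKeys.map(digest)
  return (candidate) => {
    if (typeof candidate !== 'string' || candidate === '') return false
    const h = digest(candidate)
    // Compare against every key without short-circuiting to keep timing uniform.
    let ok = false
    for (const known of hashes) {
      if (crypto.timingSafeEqual(h, known)) ok = true
    }
    return ok
  }
}

// nginx auth_request only looks at the status code: 2xx allows, 401/403 denies.
function makeValidatorServer(isValidKey) {
  return http.createServer((req, res) => {
    const allowed = req.url === '/validate' && isValidKey(req.headers['x-api-key'])
    res.writeHead(allowed ? 204 : 401)
    res.end()
  })
}
```

To run it next to nginx, something like makeValidatorServer(makeKeyChecker([process.env.LLM_API_KEY])).listen(8081, '127.0.0.1') would do, where LLM_API_KEY is an assumed environment variable. Binding the validator to loopback keeps it unreachable from the network.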

3. Rate limiting per API key

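# nginx rate limiting
# limit_req_zone belongs in the http {} context; the location goes inside the server {} block.
# Requests without an X-API-Key header have an empty key and are not counted in
# llm_per_key, so the per-IP zone is the backstop for them.
limit_req_zone $http_x_api_key zone=llm_per_key:10m rate=10r/m;
limit_req_zone $binary_remote_addr zone=llm_per_ip:10m rate=30r/m;

location /api/ {
  # This location takes precedence over "location /", so repeat the auth check here
  auth_request /validate-key;
  limit_req zone=llm_per_key burst=5 nodelay;
  limit_req zone=llm_per_ip burst=10 nodelay;
  limit_req_status 429;   # default is 503
  proxy_pass http://127.0.0.1:11434;
}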
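
To build intuition for what rate=10r/m with burst=5 nodelay permits, the sketch below is a simplified model of the leaky-bucket accounting behind limit_req. It is an illustration, not nginx's code: the function name makeLimiter is hypothetical, and time is passed in explicitly so the behavior is easy to reason about.

```javascript
// Simplified model of nginx limit_req with `nodelay`.
// "excess" is tracked in thousandths of a request.
function makeLimiter(ratePerMinute, burst) {
  const state = new Map() // key -> { excess, last }
  return function allow(key, nowMs) {
    const entry = state.get(key)
    if (!entry) {
      // The first request for a key is always accepted.
      state.set(key, { excess: 0, last: nowMs })
      return true
    }
    const elapsed = Math.max(0, nowMs - entry.last)
    // Drain at the configured rate, then add this request (1000 milli-requests).
    const drained = Math.floor((ratePerMinute * elapsed) / 60)
    const excess = Math.max(0, entry.excess - drained + 1000)
    // Over the burst allowance: reject and leave the stored state unchanged.
    if (excess > burst * 1000) return false
    entry.excess = excess
    entry.last = nowMs
    return true
  }
}
```

In this model, a key configured with makeLimiter(10, 5) gets six requests through at once (one plus the burst of five), the seventh is rejected, and one further request is admitted for every six seconds that pass.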

4. Audit logging via proxy

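// LLM gateway audit middleware (Node.js/Express)
const crypto = require('node:crypto')
const fs = require('node:fs')
const express = require('express')

const app = express()
app.use(express.json({ limit: '1mb' }))  // req.body stays undefined without a body parser

const sha256 = (s) => crypto.createHash('sha256').update(s).digest('hex')
// Log a truncated hash of the key, never the key itself
const hashKey = (key) => (key ? sha256(String(key)).slice(0, 16) : null)
// Append-only JSON Lines file; the path is an example
const auditLog = fs.createWriteStream('/var/log/llm-gateway/audit.jsonl', { flags: 'a' })

app.use('/api', (req, res, next) => {
  const start = Date.now()
  const prompt = String(req.body?.prompt ?? '')
  const logEntry = {
    ts: new Date().toISOString(),
    apiKey: hashKey(req.headers['x-api-key']),
    ip: req.ip,
    model: req.body?.model,
    promptHash: sha256(prompt),      // correlate repeated prompts without storing content
    promptLength: prompt.length,
  }
  // Write the entry once the response is done, so status and duration are known
  res.on('finish', () => {
    logEntry.duration = Date.now() - start
    logEntry.status = res.statusCode
    auditLog.write(JSON.stringify(logEntry) + '\n')
  })
  next()
})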

5. Network isolation with iptables/nftables

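# Allow only app server to reach LLM gateway
iptables -A INPUT -i lo -j ACCEPT                               # keep loopback open for the local reverse proxy
iptables -A INPUT -p tcp --dport 11434 -s 10.0.1.5 -j ACCEPT   # app server IP
iptables -A INPUT -p tcp --dport 11434 -j DROP                    # block all others
# Rules are evaluated in order and -A appends, so an earlier, broader ACCEPT still wins.
# Ports published by Docker bypass INPUT; filter those in the DOCKER-USER chain instead.

# Verify
iptables -L INPUT -n -v | grep 11434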

LiteLLM Proxy as Secure Gateway

LiteLLM Proxy provides a hardened, unified gateway for multiple LLM providers with built-in auth, rate limiting, and spend tracking:

# litellm_config.yaml
model_list:
  - model_name: "moltbot-llm"
    litellm_params:
      model: "ollama/mistral"
      api_base: "http://127.0.0.1:11434"

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY  # read from env var; the value must start with "sk-"

litellm_settings:
  max_budget: 100          # USD spend limit
  budget_duration: "1mo"
  success_callback: ["langfuse"]  # audit trail; needs LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY in the env

router_settings:
  routing_strategy: "usage-based-routing"
  num_retries: 3
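
With this config in place, application services talk to the proxy instead of to Ollama directly. The sketch below shows one way a client could call it, assuming the proxy listens on its default port 4000, exposes the OpenAI-compatible /chat/completions route, and that the caller holds a key issued by the proxy. The function names buildChatRequest and ask are illustrative.

```javascript
// Build an OpenAI-style chat completion request for the proxy.
function buildChatRequest(baseUrl, apiKey, model, prompt) {
  return {
    url: new URL('/chat/completions', baseUrl).toString(),
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`, // the proxy rejects requests without a valid key
      },
      body: JSON.stringify({ model, messages: [{ role: 'user', content: prompt }] }),
    },
  }
}

// Send the request and return the assistant's reply text.
// "moltbot-llm" is the model_name defined in litellm_config.yaml.
async function ask(baseUrl, apiKey, prompt) {
  const { url, options } = buildChatRequest(baseUrl, apiKey, 'moltbot-llm', prompt)
  const res = await fetch(url, options)
  if (!res.ok) throw new Error(`gateway returned ${res.status}`)
  const data = await res.json()
  return data.choices[0].message.content
}
```

A call such as ask('http://127.0.0.1:4000', process.env.LITELLM_API_KEY, 'ping') should succeed with a valid key and fail with an authentication error without one, which is a quick check that the gateway is enforcing auth. LITELLM_API_KEY is an assumed variable name.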


ClawGuru Security Team

✓ Verified

Security Research & Engineering · Self-Hosted AI Specialists

📅 Published: 27.04.2026 · 🔄 Last reviewed: 27.04.2026

This guide is based on hands-on experience with self-hosted LLM deployments. We have hardened dozens of Ollama and LiteLLM instances for production use. In our experience, an open LLM gateway is the most common security problem in AI infrastructure.

🔒 Verified by ClawGuru Security Team · All information fact-checked and peer-reviewed