AI Tool Use Security · Production-Ready Guide

AI Tool Use Security: Your AI agent has unsecured tools. Shell commands, HTTP requests, file writes. Prompt injection → RCE, SSRF, data exfiltration. Your CEO fired the CISO.

Your AI agent has no tool security, no scope restriction, and no human-in-the-loop (HITL) approval. Shell commands run without a sandbox, HTTP requests without an allowlist, file writes without confirmation. The result: 48 hours of incident response, data exfiltration, and a CEO who fired the CISO. Here is how to prevent that.

"Not a Pentest" notice: this is a security guide for your own AI agent tools. It contains no attack tools.
7 tool risk categories
2 CRITICAL tool types
HITL required for write tools
0 trusted tool outputs

What is Tool Use Security? Explained simply.

Think of tool use security like the safety of physical tools: once an LLM can call tools (shell commands, HTTP requests, database queries), the attack surface explodes. Through unsecured tools, a prompt injection can pivot to the host, the internal network, or sensitive data. Good tool use security means: Principle of Least Tool, sandboxing, HITL.


7 Tool Risk Categories

| Tool | Risk | Attack vector | Defense |
| --- | --- | --- | --- |
| Shell / Code Execution | CRITICAL | Prompt injection → arbitrary command execution on host | Run in a --read-only container with --cap-drop=ALL. Allowlist permitted commands. 30s hard timeout. Never run as root. |
| HTTP / Web Requests | HIGH | SSRF → internal network access, metadata endpoint, cloud credentials | Allowlist permitted domains/IPs. Block RFC 1918 ranges and link-local (169.254.0.0/16). Validate URLs before fetch. Log all requests. |
| File System Read | HIGH | Path traversal → read /etc/passwd, ~/.ssh/id_rsa, .env files | Restrict to declared workspace directory. Validate resolved path against workspace root. Block symlink traversal. |
| File System Write | CRITICAL | Overwrite config files, inject malicious code, modify agent behavior | Require human confirmation for all writes. Scope to temp directory only. Audit all write operations. |
| Database Queries | HIGH | SQL injection via LLM-generated queries, data exfiltration | Use parameterized queries only, never string-interpolated SQL. Read-only credentials for read operations. Scope to minimal required tables. |
| Email / Notifications | HIGH | Data exfiltration via email, spam/phishing via LLM-drafted content | Require human approval for all external sends. Allowlist recipients. Content review before send. Rate limit: max 10 emails/hour. |
| Calendar / Scheduling | MEDIUM | Unwanted calendar events, social engineering via agent-created meetings | Human-in-the-loop for all external calendar invites. Scope to own calendar only by default. |
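The File System Read defense (validate the resolved path against the workspace root, block symlink traversal) can be sketched roughly as follows. This is a minimal illustration, not a complete file tool: the names `resolve_in_workspace` and `PathOutsideWorkspace` are hypothetical, and it assumes Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


class PathOutsideWorkspace(Exception):
    """Raised when a requested path resolves outside the workspace root."""


def resolve_in_workspace(workspace_root: str, requested: str) -> Path:
    """Return the real path for `requested`, or raise if it leaves the workspace."""
    root = Path(workspace_root).resolve()
    # resolve() collapses ".." segments and follows symlinks, so the check
    # below sees the real target. An absolute `requested` replaces `root`
    # in the join and is therefore rejected by the same check.
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):
        raise PathOutsideWorkspace(f"{requested!r} escapes {root}")
    return candidate
```

A file tool would call `resolve_in_workspace("/tmp/agent-sandbox", user_path)` before every open and use only the returned path. The check happens at resolution time, so a symlink created between the check and the open could still slip through; opening the file inside a sandbox limits that window.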

Principle of Least Tool

Start with zero tools. Add back only what the specific task requires. A summarization agent needs no tools at all. A research agent needs only HTTP read. A coding agent needs only file read + write in a scoped temp directory.

# BAD: register all tools "just in case"
agent = Agent(tools=[ShellTool(), FileTool(), HTTPTool(),
                     EmailTool(), DBTool(), CalendarTool()])

# GOOD: minimum required for the specific task
summarizer = Agent(tools=[])  # No tools needed
researcher = Agent(tools=[HTTPTool(allowlist=["arxiv.org", "pubmed.ncbi.nlm.nih.gov"])])
coder = Agent(tools=[
  FileTool(workspace="/tmp/agent-sandbox", mode="rw"),
  # Shell removed — use isolated subprocess instead
])
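One way to enforce this at registration time, rather than by convention, is a per-task allowlist that rejects any tool the task profile does not declare. The sketch below is an assumption-laden illustration: `TASK_TOOLS`, `tools_for`, and the tool names are hypothetical and not part of any agent framework.

```python
# Hypothetical task profiles mirroring the three examples in the text.
TASK_TOOLS: dict[str, frozenset[str]] = {
    "summarize": frozenset(),
    "research": frozenset({"http_read"}),
    "code": frozenset({"file_read", "file_write"}),
}


def tools_for(task: str, requested: set[str]) -> set[str]:
    """Return the requested tools if the task profile permits all of them."""
    # An unknown task gets the empty profile, i.e. zero tools.
    allowed = TASK_TOOLS.get(task, frozenset())
    denied = requested - allowed
    if denied:
        raise PermissionError(f"task {task!r} may not use: {sorted(denied)}")
    return set(requested)
```

With this in place, `tools_for("research", {"http_read", "shell"})` fails loudly instead of silently handing a research agent a shell.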

Real-World Scars: Production Incidents

SCAR #1: Shell tool without a sandbox (CRITICAL)

Shell tool without a sandbox. Prompt injection → RCE on the host, data exfiltration. Fix: container with --cap-drop=ALL, command allowlist, timeout.

Root cause: no sandbox for the shell tool. Lesson: run it in a --read-only container with --cap-drop=ALL.
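A rough sketch of that fix, assuming Docker is the container runtime: the command is checked against an allowlist, then passed as an argument vector (no shell) to a read-only, capability-free container with a 30-second timeout. The allowlist contents, the image tag, the non-root UID, and `--network=none` are illustrative assumptions that go beyond what the incident description states.

```python
import shlex
import subprocess

ALLOWED_COMMANDS = {"ls", "cat", "grep", "wc"}  # hypothetical allowlist
TIMEOUT_SECONDS = 30  # hard timeout from the risk table


def build_sandbox_argv(command: str, image: str = "alpine:3.20") -> list[str]:
    """Build a `docker run` argument vector for an allowlisted command."""
    parts = shlex.split(command)
    if not parts or parts[0] not in ALLOWED_COMMANDS:
        raise PermissionError(f"command not allowlisted: {command!r}")
    return [
        "docker", "run", "--rm",
        "--read-only",             # immutable root filesystem
        "--cap-drop=ALL",          # drop every Linux capability
        "--network=none",          # assumption: the shell tool needs no network
        "--user", "65534:65534",   # assumption: an unprivileged UID, never root
        image, *parts,
    ]


def run_sandboxed(command: str) -> subprocess.CompletedProcess:
    """Run the command in the container; raises TimeoutExpired after 30s."""
    return subprocess.run(
        build_sandbox_argv(command),
        capture_output=True, text=True, timeout=TIMEOUT_SECONDS,
    )
```

Because the arguments never pass through a shell, input such as `ls; rm -rf /` is rejected (its first token is `ls;`, which is not allowlisted) rather than interpreted. Note that the `subprocess` timeout stops the Docker client process; a production setup would likely also need to stop the container itself.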

SCAR #2: HTTP tool without an allowlist (HIGH)

HTTP tool without an allowlist. SSRF → internal network, metadata endpoint, cloud credentials. Fix: domain allowlist, RFC 1918 block.

Root cause: no allowlist for the HTTP tool. Lesson: enable a domain allowlist and block RFC 1918 ranges.

Immediate actions: what to do today

1. Run a tool audit

List all tools, classify them by risk, and remove the ones you do not need.

2. Sandbox dangerous tools

Isolate shell and code execution tools in a container with --cap-drop=ALL.

3. HITL for CRITICAL tools

Require human-in-the-loop approval for shell and file-write tools, and for external email sends.

Interactive Tool Use Checklist

Tool Use Maturity Score Calculator

Have you run a tool audit?
Is a sandbox active for shell tools?
Is HITL active for CRITICAL tools?
Is tool output sanitization active?
Your Tool Use Maturity Score: 0/100

Industry average: 16/100

Frequently Asked Questions

What is the biggest security risk of LLM function calling?

Unscoped tool access combined with prompt injection. An LLM with access to a shell tool and no sandboxing can be prompted to execute arbitrary commands. The fix: every tool must have a declared scope, run in an isolated container, and dangerous tools (shell, file write, HTTP) require human confirmation or are restricted to an allowlist.

How do I implement human-in-the-loop for AI tool use?

For high-risk tools: before execution, present the proposed tool call (tool name + parameters) to a human operator via a review interface. Only execute after explicit approval. Log: approver identity, approval timestamp, original LLM reasoning. Implement a timeout — if no approval within X minutes, cancel the action.
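The approval flow described above could look roughly like this. It is a sketch under assumptions: `ask_human` stands in for whatever review interface you use and is expected to return `None` when the timeout elapses; `Decision`, `AuditEntry`, `AUDIT_LOG`, and `gated_call` are hypothetical names.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Decision:
    approver: str
    approved: bool


@dataclass
class AuditEntry:
    tool: str
    params: dict
    reasoning: str            # the LLM's stated reason for the call
    approver: Optional[str]   # None when nobody answered in time
    outcome: str              # "approved", "rejected", or "timeout"
    timestamp: float


AUDIT_LOG: list[AuditEntry] = []

# Hypothetical reviewer callback: (tool, params, reasoning, timeout_s) -> Decision or None.
AskHuman = Callable[[str, dict, str, float], Optional[Decision]]


def gated_call(tool_name: str, tool_fn: Callable[..., Any], params: dict,
               reasoning: str, ask_human: AskHuman, timeout_s: float = 300.0) -> Any:
    """Execute tool_fn only after explicit human approval; log every decision."""
    decision = ask_human(tool_name, params, reasoning, timeout_s)
    if decision is None:
        outcome, approver = "timeout", None
    else:
        outcome = "approved" if decision.approved else "rejected"
        approver = decision.approver
    AUDIT_LOG.append(AuditEntry(tool_name, params, reasoning, approver, outcome, time.time()))
    if outcome != "approved":
        raise PermissionError(f"{tool_name} not executed: {outcome}")
    return tool_fn(**params)
```

The default is deny: a rejection and a timeout both cancel the action, and the audit entry is written before the tool runs so that a crash during execution still leaves a record.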

Can I trust tool outputs fed back to the LLM?

Never unconditionally. Tool outputs can contain adversarial content (e.g., a web page with injected instructions). Sanitize all tool outputs before feeding back to the LLM: strip HTML, extract structured data only, apply the same injection detection as user inputs. Treat tool output as untrusted data, not as trusted system context.
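As a minimal illustration of this answer, the sketch below strips HTML with the standard library parser, drops script and style content, caps the length, and flags a few injection phrases. The phrase list is a deliberately small, hypothetical heuristic and not a real detector; `sanitize_tool_output` and its return shape are assumptions.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text; skips <script>/<style> bodies and comments."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


# Hypothetical heuristic patterns; a real deployment would reuse the same
# injection detection it applies to user input.
SUSPICIOUS = re.compile(
    r"ignore (all )?(previous|prior) instructions|system prompt|you are now",
    re.IGNORECASE,
)


def sanitize_tool_output(raw: str, max_chars: int = 4000) -> dict:
    """Return plain text marked as untrusted, with an injection flag."""
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())[:max_chars]
    return {"trusted": False, "flagged": bool(SUSPICIOUS.search(text)), "text": text}
```

The `trusted: False` marker is there so the calling code passes the text to the model as data (for example inside a clearly delimited tool-result message), never as system context. A flagged result would typically be dropped or routed to review.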

How do I prevent SSRF via AI HTTP tools?

1) Allowlist permitted domains and reject everything else. 2) Resolve the URL and check that the IP is not in an RFC 1918 range (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) or link-local (169.254.0.0/16). 3) Follow redirects but re-validate each redirect target. 4) Block metadata endpoints: 169.254.169.254 (AWS), metadata.google.internal. 5) Log all HTTP tool calls with URL, response code, response size.
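Steps 1, 2, and 4 can be sketched with the standard library alone. This is an illustrative check, not a full HTTP tool: `validate_url` and `is_public_ip` are hypothetical names, the allowlist reuses the two domains from the earlier example, and the injectable `resolve` parameter exists only so the check can be exercised without network access.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"arxiv.org", "pubmed.ncbi.nlm.nih.gov"}


def is_public_ip(ip: str) -> bool:
    """False for RFC 1918, loopback, link-local (incl. 169.254.169.254), etc."""
    addr = ipaddress.ip_address(ip)
    return not (
        addr.is_private or addr.is_loopback or addr.is_link_local
        or addr.is_reserved or addr.is_multicast or addr.is_unspecified
    )


def validate_url(url: str, resolve=socket.getaddrinfo) -> list[str]:
    """Return the resolved public IPs for an allowlisted URL, else raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    host = (parts.hostname or "").lower()
    if host not in ALLOWED_HOSTS:
        raise ValueError(f"host not allowlisted: {host!r}")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    ips = sorted({info[4][0] for info in resolve(host, port, type=socket.SOCK_STREAM)})
    # Every resolved address must be public; one internal record is enough to reject.
    if not ips or not all(is_public_ip(ip) for ip in ips):
        raise ValueError(f"{host!r} resolves to a non-public address: {ips}")
    return ips
```

For step 3, the HTTP client would disable automatic redirects and pass each `Location` target through `validate_url` again. Connecting to one of the returned IPs, rather than resolving the hostname a second time, is one way to narrow the DNS rebinding window between check and fetch.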


R. Schwertfechter

Principal Ops-Engineer & Security Architect
📅 Published: 01.05.2026 🔄 Last reviewed: 01.05.2026
15+ years of experience as an ops engineer, incident responder, and security architect. Expert in tool use security, function calling, sandboxing, and HITL.
