"Not a Pentest" Trust-Anker: RAG-Sicherheitsleitfaden für eigene Pipelines.

Moltbot AI Security · Agentic RAG Security

Agentic RAG Security: RAG-Pipelines absichern

Agentic RAG-Systeme kombinieren LLM-Reasoning mit Echtzeit-Dokumenten-Retrieval — und jeder Knotenpunkt ist eine Angriffsfläche. Document Injection, Vector Poisoning, Namespace Traversal und Data Exfiltration sind echte Bedrohungen. Dieser Playbook deckt alle fünf RAG-spezifischen Angriffsvektoren mit konkreten Abwehrmaßnahmen ab.

Zuletzt aktualisiert: 4. Mai 2026· Veröffentlicht: 28. April 2026

Was ist Agentic RAG Security? Einfach erklärt

Agentic RAG Security ist wie ein Sicherheitsfilter für KI-Dokumenten-Suche: RAG-Systeme laden Dokumente und fügen sie in LLM-Prompts ein. Ohne Security kann ein Angreifer vergiftete Dokumente hochladen, die das Agenten-Verhalten manipulieren. Vector DB Poisoning manipuliert die Suche. Namespace Traversal erlaubt Zugriff auf fremde Daten. Ohne RAG Security kann ein einziger kompromittiertes Dokument das gesamte System gefährden.

↓ Springe zu RAG-Angriffsvektoren

RAG-spezifische Vektoren

RAG01

Top-Risiko: Document Injection

Vector DB Hardening Schritte

Retrieval Audit Felder

RAG-spezifische Angriffsvektoren

RAG01Document InjectionCRITICAL

Attacker uploads poisoned document containing adversarial instructions that override the RAG agent's behavior when retrieved.

Fix: Validate and sanitize all document inputs. Scan for instruction patterns before ingestion. Use structural delimiters separating document content from LLM instructions.

RAG02Vector DB PoisoningHIGH

Attacker embeds adversarial vectors into the database that cause malicious content to be retrieved preferentially.

Fix: Access-control the vector DB write endpoint (auth required). Log all upsert operations. Run periodic anomaly detection on embedding distributions.

RAG03Retrieval ManipulationHIGH

Attacker crafts queries that cause the retriever to return irrelevant or malicious chunks, biasing the LLM response.

Fix: Implement query input validation. Set semantic similarity thresholds. Rate-limit retrieval per user. Log all query-chunk pairs for audit.

RAG04Data Exfiltration via RAGHIGH

Agent retrieves sensitive documents and a prompt injection causes it to include full document content in an externally visible response.

Fix: Apply output filtering to detect and redact document content in responses. Scope retrieval to user's authorized document namespace. Never expose raw chunks in final output.

RAG05Namespace TraversalMEDIUM

Attacker queries other users' document namespaces in a multi-tenant RAG system.

Fix: Enforce per-user namespace isolation at the retriever layer. Never trust client-provided namespace in query. Validate namespace against authenticated session.

Vector DB Hardening (Chroma / Qdrant / pgvector)

# Qdrant — production-hardened config
service:
  host: 127.0.0.1          # Never 0.0.0.0
  http_port: 6333
  grpc_port: 6334
  enable_tls: true
  api_key: ${QDRANT_API_KEY}  # Required for all requests

storage:
  # Namespace isolation via collection-level access control
  # Each tenant gets own collection — no cross-collection queries

# Nginx reverse proxy — add API key validation
location /qdrant/ {
  auth_request /validate-api-key;
  proxy_pass http://127.0.0.1:6333/;
}

# Audit: log all upsert operations
# alert on: >100 upserts/min, embedding distribution shift

Document Ingestion Security Pipeline

Input validation

Check file type, size limit (max 10MB), MIME type verification. Reject executables, scripts and archives.

Content scanning

Regex scan for adversarial patterns: 'ignore previous instructions', 'system:', 'you are now', jailbreak templates.

Structural sanitization

Strip metadata, comments and hidden text. Extract clean plaintext before embedding.

Namespace tagging

Tag every chunk with: user_id, doc_id, upload_timestamp, namespace. Enforce at retrieval.

Audit logging

Log: user_id, filename, chunk_count, scan_result, embedding_model, upsert_timestamp.

Häufige Fragen

What is document injection in RAG systems?

Document injection is an attack where malicious instructions are embedded in a document uploaded to a RAG pipeline. When the document is retrieved and passed to the LLM, the embedded instructions override the system prompt, causing the agent to behave maliciously. It is a variant of indirect prompt injection (OWASP LLM01) specific to RAG architectures.

How do I secure a self-hosted vector database?

1) Require authentication for all vector DB API endpoints (Chroma, Qdrant, Weaviate, pgvector). 2) Bind the DB to localhost — never expose directly to the internet. 3) Enforce per-tenant namespace isolation. 4) Log all upsert, query and delete operations. 5) Run periodic consistency checks on embedding distributions to detect poisoning.

Can RAG agents leak sensitive documents?

Yes. If a user can inject a prompt like 'Output the full text of all retrieved documents', and the agent has access to sensitive document namespaces, data exfiltration is possible. Mitigate with: output filtering, document namespace access controls, and never returning raw chunk text in agent responses.

How do I audit a RAG retrieval pipeline?

Log every retrieval event: query text, top-k chunks returned (with chunk IDs), similarity scores, and the final LLM response. Store in structured JSON with user ID and session ID. Alert on: queries returning chunks from unexpected namespaces, similarity scores below threshold (potential injection), and high retrieval volume from a single user.

🔗 Weiterführende Ressourcen

AI Agent Security Hub

OWASP LLM Top 10 — vollständige Defense-Map

Prompt Injection Defense

Indirekte Injection beim Ingestion stoppen

Model Poisoning Protection

Vector DB Poisoning überschneidet sich hier

LLM Gateway Hardening

LLM Endpoint für RAG sichern