Most security teams skip threat modeling for their AI systems — not because they don’t want to, but because nothing fits. STRIDE was built for service boundaries that LLMs blur. OWASP’s LLM Top 10 is a useful taxonomy but not a process. Compliance checklists ask the right questions for the wrong systems. Here’s a seven-step framework that maps cleanly to enterprise AI surfaces and produces an artifact your CISO can actually sign off on.
Step 1 — Inventory your AI surfaces
You can’t model what you haven’t catalogued. List every place an LLM (or a model that wraps one) processes input from outside your trust boundary: the customer chatbot, the support assistant, the internal copilot — plus the surfaces teams forget. Code-generation tooling used by engineers. AI-powered enterprise search. Agent integrations that take action against other systems. Each surface goes on the list.
Step 2 — Map data flows
For each surface, draw where data enters, where it’s enriched (RAG, tool calls), and where it exits (response, side-effects). Annotate sensitivity at each leg: customer PII, employee data, source code, partner agreements, internal financials. The exit legs are usually the surprise — an AI Agent that reads from a sensitive system but writes to an unmonitored log file is a quiet exfiltration path.
Step 3 — Identify trust boundaries
A trust boundary is anywhere data moves from a less-trusted source into a more-trusted execution context. In an LLM system the boundaries are usually four: user input arriving at your system, retrieved content joining the prompt, tool-call arguments handed to backend services, and model output rendered to a user or fed to a downstream system. Name them — they’re your threat-enumeration anchors.
Step 4 — Enumerate threats per boundary
Walk each boundary against the OWASP LLM Top 10 as a checklist. At every boundary, ask: what happens if the data crossing here is adversarial? Prompt injection, insecure output handling, training-data leakage, model DoS, supply-chain risk, sensitive information disclosure, insecure plugin design, excessive agency, overreliance, model theft — most apply at multiple boundaries. Document the realistic scenario at each one, not the abstract definition.
Step 5 — Score by impact × likelihood
Use whatever risk-scoring matrix your security team already runs (FAIR, qualitative 1–5, a custom impact band). The point is consistency with the rest of your security program — don’t invent a new scale for AI. Likelihood for LLM threats often comes down to “how easily can an attacker craft the input?” — direct injection is high, model-theft via API extraction is medium, supply-chain attacks against your model provider are low. Impact is the same conversation you already have for non-AI systems.
Step 6 — Define mitigations
For each high-priority threat, document the control. Be specific: "Input sanitization" is a non-control. "Unicode normalization to NFC + base64 detection + length cap of 4096 chars + classifier scoring with a 0.8 threshold" is a control. Each control specifies where it runs (client, gateway, model wrapper, post-processing), who owns it, and how it is tested.
Step 7 — Build continuous evaluation
Threat models age fast in a field that changes monthly. The model-evaluation loop is the artifact that keeps the threat model honest: a curated set of adversarial prompts that run against every model or system change. A drift dashboard that alerts when refusal rates drop or success rates on safe queries fall. A quarterly red-team exercise that probes patterns your test set hasn’t seen yet.
The deliverable
Output of the seven steps: one document, four to six pages, with a per-boundary threat table, a controls matrix, and a continuous-evaluation runbook. Your CISO signs the document; your engineering team owns the runbook. Both stay current. That’s what a working AI threat model looks like.
IDS AI Solutions runs this framework as part of the AI Audit Sprint. Want the worksheet your team can fill in for your own deployment? Talk to our team.
