
Applying MAESTRO to Real-World Agentic AI Threat Models: From Framework to CI/CD Pipeline

Published 02/11/2026

Written by Steven Leath and Ken Huang.

Every security team I talk to is having the same conversation right now. Their developers are shipping AI agents — coding assistants, autonomous workflows, LLM-powered tools that can browse the web, execute code, query databases, and send emails on behalf of users. The agents live in production.

The threat models are not.

This isn't a knowledge problem. The MAESTRO framework gave us an excellent conceptual map for understanding agentic AI threats. Its seven-layer architecture, from Foundation Models up through Ecosystem Integration, captures attack surfaces that traditional security frameworks were never designed to address. But there's a gap between understanding the framework and operationalizing it. Most teams I've worked with have read the MAESTRO paper, nodded along, and then gone right back to running the same SAST scanners they've always used — tools that see an HTTP endpoint and a database connection, but are completely blind to prompt injection chains, cross-layer trust boundary violations, and the unique risks that emerge when you give an LLM the ability to act on the world.

This post is about closing that gap. I'll walk through how we built MAESTRO classification into an automated threat modeling tool called TITO (Threat In and Threat Out), what it actually finds when pointed at real agentic AI codebases, and how to make threat modeling a continuous part of your CI/CD pipeline rather than a document that lives in Confluence and gets stale the moment it's written.

 

The Fundamental Problem with Threat Modeling AI Agents

Traditional threat modeling works by decomposing a system into components, identifying trust boundaries, and applying a classification framework — usually STRIDE — to enumerate potential threats at each boundary. This works well for conventional architectures because the trust boundaries are relatively static and well-understood. The web server trusts the database. The API gateway validates tokens. The input is sanitized before it hits the query.

Agentic AI systems break this model in a fundamental way. The trust boundaries aren't static — they're dynamic and defined by the LLM's behavior at runtime. Consider what happens when a user sends a message to an AI agent with tool access:

The user's input enters through a chat API. So far, so normal — this is a trust boundary that any scanner can identify. But then the input reaches the LLM, and something unprecedented happens. The model interprets the input, reasons about it, and decides what actions to take. It might decide to read a file, execute a shell command, query a database, and send the results to a webhook — all from a single user message. Each of those actions crosses a different trust boundary, and the decision about which boundaries to cross is made by a probabilistic model that can be manipulated through the input itself.

This is prompt injection, and it's not just a theoretical risk. It's a new class of code execution vulnerability where the "code" is natural language and the "interpreter" is an LLM with system privileges. An attacker doesn't need to find a buffer overflow or an SQL injection when they can simply ask the agent to do dangerous things — or more subtly, embed instructions in data the agent will consume later through its memory or RAG system.

MAESTRO captures this through its layer model. Layer 1 (Foundation Models) defines the base risk of LLM integration. Layer 3 (Agent Frameworks) addresses the reasoning loop and tool dispatch. Layer 4 (Deployment and Infrastructure) covers the environment where those capabilities actually execute. The insight is that threats don't just exist at individual layers — they chain across layers, and the most dangerous attack paths are the ones that start at Layer 1 and cascade through to Layer 4 and beyond.

The 7 layers of the MAESTRO framework are depicted below. TITO focuses on threats in Layers 1 through 3, since it works by scanning code and integrating into the CI/CD pipeline, and code and pipeline configuration are primarily Layer 1 through Layer 3 artifacts.

The seven layers of the MAESTRO framework

The Agent Ecosystem layer is where traditional security intuition breaks down most dramatically. At this layer, threats no longer map cleanly to individual vulnerabilities or discrete components. Instead, risk emerges from the composition of agents, tools, memory, and external integrations — often across organizational and trust boundaries. An individual function call may appear benign in isolation, but when orchestrated by an LLM-driven planner, invoked through a tool registry, and informed by mutable long-term memory, it becomes part of an attack surface that spans time and intent. MAESTRO’s Agent Ecosystem layer gives security teams a way to reason about that emergent behavior, rather than chasing individual findings that miss the broader narrative.

The problem is that none of this shows up in a traditional scan. Your SAST tool sees `exec.Command("bash", "-c", cmd)` and flags a command injection — which is correct but misses the point entirely. The real threat isn't that the code calls exec. It's that the code calls exec with arguments determined by an LLM processing untrusted input, and that the LLM's decision to make that call can be influenced by prompt injection. The attack surface is the entire chain, not any single link.
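To make that chain concrete, here is a minimal sketch in Go of the pattern being described. The function and type names are hypothetical, not taken from any particular framework or from the codebase scanned later in this post; the point is the shape of the data flow, not a specific API.

```go
package agent

import (
	"context"
	"encoding/json"
	"os/exec"
)

// ToolCall is what the model returns when it decides to act.
// Hypothetical shape; real frameworks differ in detail but not in spirit.
type ToolCall struct {
	Name string          `json:"name"`
	Args json.RawMessage `json:"args"`
}

// callLLM stands in for any chat-completion client. The user's message goes in
// verbatim; whatever the model decides to do comes back as tool calls.
func callLLM(ctx context.Context, userMessage string) ([]ToolCall, error) {
	// ... send userMessage to the model API, parse tool calls from the response
	return nil, nil
}

// handleMessage is the whole attack surface in one function: untrusted input
// crosses into the model (MAESTRO Layer 1), the model's output drives tool
// dispatch (Layer 3), and the dispatched tool acts on the host environment
// (Layer 4), with no validation at any of those boundaries.
func handleMessage(ctx context.Context, userMessage string) error {
	calls, err := callLLM(ctx, userMessage)
	if err != nil {
		return err
	}
	for _, call := range calls {
		if call.Name == "run_shell" {
			var args struct {
				Cmd string `json:"cmd"`
			}
			if err := json.Unmarshal(call.Args, &args); err != nil {
				return err
			}
			// The SAST finding points here. The real threat is everything above it.
			if err := exec.CommandContext(ctx, "bash", "-c", args.Cmd).Run(); err != nil {
				return err
			}
		}
	}
	return nil
}
```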

 

Building MAESTRO Into the Scanner

TITO started as a straightforward threat modeling tool — point it at a repository, identify assets and data flows, classify threats with STRIDE-LM. But when we started scanning agentic AI codebases, the limitations became obvious immediately. The tool was finding real issues, but it was missing the agentic dimension entirely. It would flag a database query as a potential SQL injection without recognizing that the query parameters originated from an LLM's tool call, which in turn originated from user input that crossed three trust boundaries with zero validation between them.

Adding MAESTRO classification changed what the tool could see. When TITO scans a repository now, it first identifies the agentic AI patterns present in the codebase: LLM API integrations, tool registries, agent loops, memory systems, multi-agent communication channels. It maps these patterns to MAESTRO layers and then analyzes threats within the context of those layers.

Operationalizing MAESTRO required treating its layers not as labels, but as execution context. Rather than asking “what vulnerability is this?”, TITO asks “where in the MAESTRO stack does this behavior originate, and how does it propagate?” Each finding is evaluated in terms of its upstream inputs, downstream actions, and cross-layer amplification. This allows the scanner to surface threats that only exist because an LLM is present — such as prompt-derived control flow, tool misuse driven by indirect input, or persistence achieved through agent memory — and to describe them in a way that aligns with how agentic systems actually operate.
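As a rough illustration of what treating layers as execution context can mean, the sketch below shows one way a cross-layer finding could be represented. This is a simplified, hypothetical structure written for this post, not TITO's actual internal schema; the example ID and field values are invented.

```go
package model

// MaestroLayer identifies where in the MAESTRO stack a behavior originates
// or propagates. Only the layers discussed in this post are listed.
type MaestroLayer int

const (
	LayerFoundationModels MaestroLayer = 1 // Foundation Models
	LayerAgentFrameworks  MaestroLayer = 3 // Agent Frameworks
	LayerDeployment       MaestroLayer = 4 // Deployment and Infrastructure
)

// Finding captures a threat together with the context needed to reason about
// cross-layer propagation, rather than just a file and line number.
type Finding struct {
	ID          string
	Title       string
	StrideLM    string         // e.g. "Tampering" or "Elevation of Privilege"
	OriginLayer MaestroLayer   // where the behavior starts
	ReachesTo   []MaestroLayer // layers the behavior propagates into
	Upstream    []string       // inputs that can influence this behavior
	Downstream  []string       // actions this behavior can trigger
	Severity    string
}

// PromptInjectionExample shows why context matters: the severity comes less
// from the input handling itself than from what that input can reach.
var PromptInjectionExample = Finding{
	ID:          "EXAMPLE-001",
	Title:       "User input reaches the model without semantic validation",
	StrideLM:    "Tampering",
	OriginLayer: LayerFoundationModels,
	ReachesTo:   []MaestroLayer{LayerAgentFrameworks, LayerDeployment},
	Upstream:    []string{"chat API request body"},
	Downstream:  []string{"shell execution tool", "file write tool", "notification webhook"},
	Severity:    "critical",
}
```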

The difference is significant. A hardcoded API key is always a critical finding, but in an agentic AI system, it's not just a credential exposure problem — it's a foundation layer compromise that affects every downstream agent behavior. A file operation without path validation is a path traversal risk in any application, but in an agent with tool access, it's a mechanism for persistent compromise through the agent's memory system. MAESTRO context transforms individual findings into attack narratives.

 

What We Found: Scanning a Real AI Agent

To make this concrete, let's walk through an actual scan. The target is a representative agentic AI system — an LLM-powered assistant with five tool capabilities: shell execution, file read/write, web browsing, database queries, and external notifications. It exposes a chat API, stores conversations in PostgreSQL, and generates embeddings for a RAG memory system. If you're building with LangChain, CrewAI, AutoGen, or any agent framework, this architecture will look familiar.
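Reduced to its essentials, the tool surface of an agent like this looks something like the sketch below. The names and stubs are hypothetical, not the scanned code itself; what matters is how little stands between the model's decision and execution.

```go
package agent

import "context"

// ToolFunc executes one capability with arguments chosen by the model.
type ToolFunc func(ctx context.Context, args map[string]any) (string, error)

// Stub implementations stand in for the real capabilities.
func runShell(ctx context.Context, args map[string]any) (string, error) { return "", nil }
func fileIO(ctx context.Context, args map[string]any) (string, error)   { return "", nil }
func browse(ctx context.Context, args map[string]any) (string, error)   { return "", nil }
func queryDB(ctx context.Context, args map[string]any) (string, error)  { return "", nil }
func notify(ctx context.Context, args map[string]any) (string, error)   { return "", nil }

// tools is everything the LLM is allowed to do. Note what is missing: no
// per-tool permissions, no caller identity, no policy or sandboxing hook
// between the model's decision and execution.
var tools = map[string]ToolFunc{
	"run_shell": runShell, // shell execution on the host
	"file_io":   fileIO,   // read/write any path the process can reach
	"browse":    browse,   // fetch external URLs, whose content the model will also consume
	"query_db":  queryDB,  // SQL against the conversation database
	"notify":    notify,   // POST to an external webhook, a ready-made exfiltration channel
}
```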

What was most striking was not the number of issues identified, but how differently they presented when viewed through the MAESTRO lens. Several findings that would normally be triaged as medium or informational became high-impact once their role in the agent’s decision-making loop was understood. In multiple cases, user-controlled input influenced tool invocation indirectly, crossed multiple trust boundaries, and persisted via memory — a pattern that only became visible when threats were analyzed across MAESTRO layers rather than as isolated defects. This reframing changed not just prioritization, but remediation strategy, shifting the focus from patching endpoints to constraining agent behavior.

Running TITO with MAESTRO and MITRE ATT&CK enrichment takes about three seconds:

```bash
tito scan --repo ./ai-agent --maestro --mitre --attack-paths --report threat-model.md
```

TITO discovered 24 assets across six categories — APIs, databases, secrets, network interfaces, filesystem operations, and cryptographic functions — with 169 data flows between them. It then identified 9 distinct threats, classified them across STRIDE-LM and MAESTRO layers, and mapped each to relevant MITRE ATT&CK techniques.

The headline finding itself wasn’t surprising: a critical credential exposure where the OpenAI API key flows from environment configuration into HTTP headers sent to an external endpoint. Every scanner catches this, and it is rightly treated as a high-severity issue. On its own, this finding reflects a familiar class of risk — credential leakage that can lead to unauthorized API usage, cost overruns, or data exposure. Traditional tooling does a good job here, and this is not where the analysis became interesting.

What made the TITO report different was what came next.

The second finding was an LLM prompt injection risk at MAESTRO Layer 1. TITO identified a common agentic pattern: user-controlled input flowing directly into a message array that is passed to the model API without semantic validation or constraint. In isolation, this is often dismissed as an abstract or theoretical concern. However, TITO did not stop at identifying the prompt injection vector. It traced the downstream effects of that input as it propagated through the system.

The LLM’s responses flowed directly into a tool dispatch mechanism with no permission model, policy enforcement, or capability scoping. That dispatcher exposed multiple high-impact tools: shell command execution, arbitrary file read/write, database queries, and outbound network notifications. There was no validation, authorization check, sandboxing, or contextual guardrail between the LLM’s output and tool execution. In other words, the system implicitly trusted the LLM to behave safely while granting it the ability to act with the full privileges of the application.

In MAESTRO terms, this forms a clear Layer 1 → Layer 3 → Layer 4 attack chain: user input influences model behavior (Layer 1), the model drives agent planning and tool selection (Layer 3), and those tools directly affect external systems and the environment (Layer 4). In practical terms, this means that a successful prompt injection yields the equivalent of an authenticated shell on the host, read/write access to the database, and a built-in data exfiltration channel via the notification webhook. This is not a hypothetical escalation; it is a direct consequence of how the agent is architected.

The risk compounds further because the agent persists conversation history in long-term memory. A single successful prompt injection does not merely trigger a one-time action — it can poison the agent’s context for all subsequent interactions. This creates a persistent compromise where malicious instructions or assumptions are silently reinforced over time, even after the original user input is no longer present. Traditional scanners have no way to represent or reason about this kind of temporal, stateful risk.
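The persistence risk follows directly from how agent memory is usually wired. In a minimal sketch like the one below (hypothetical code, not the scanned system), anything that made it into stored history, including injected instructions, is replayed into the model's context on every subsequent turn.

```go
package agent

import "context"

// Message is one turn of conversation, persisted verbatim.
type Message struct {
	Role    string // "user", "assistant", "tool"
	Content string
}

// Store persists conversation history, e.g. PostgreSQL plus embeddings for RAG.
type Store interface {
	Append(ctx context.Context, convID string, m Message) error
	History(ctx context.Context, convID string) ([]Message, error)
}

// nextTurn shows the stateful problem: everything previously stored, including
// any injected instructions that arrived via a user message, a fetched web page,
// or a tool result, is loaded back into the prompt before the model reasons
// about the new input. One successful injection keeps influencing every turn.
func nextTurn(ctx context.Context, s Store, convID, userInput string) ([]Message, error) {
	history, err := s.History(ctx, convID)
	if err != nil {
		return nil, err
	}
	prompt := append(history, Message{Role: "user", Content: userInput})
	// prompt goes to the model unfiltered; elsewhere, the reply and tool results
	// are appended back via Append, reinforcing whatever is already stored.
	return prompt, nil
}
```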

The four medium-severity findings tell a more subtle but equally important story. Each was classified as an “unvalidated trust boundary crossing”: data flowing from one component to another without checks, normalization, or enforcement. Individually, each finding looks minor — the sort of issue that is often deprioritized or accepted as technical debt. But viewed together, they describe the complete absence of a defense-in-depth strategy.

There is literally no point in the agent’s execution pipeline where input is validated between the user and the database. User input flows into the LLM, the LLM’s output flows into tool invocation, and tool output flows into persistent storage and external systems — all without meaningful constraint. TITO’s attack path analysis connects these findings into a single coherent narrative, showing how small, individually “acceptable” gaps combine into a full end-to-end compromise path. This is not something a list of isolated findings can communicate.

This is what MAESTRO brings to automated threat modeling that STRIDE alone cannot. STRIDE is effective at describing what can go wrong at a specific component or boundary. MAESTRO explains how agentic AI behaviors — planning, tool use, memory, and autonomy — create entirely new ways for things to go wrong across components, over time, and at system scale. That difference is not academic; it fundamentally changes how these systems must be secured.

 

Making It Continuous: Threat Models in Your Pipeline

A threat model report is useful. A threat model that runs on every pull request and blocks merges that introduce critical agentic AI threats is transformative. The difference is not just automation; it’s enforcement. Instead of being a point-in-time artifact created during design reviews, the threat model becomes an active control that continuously governs how the system is allowed to evolve.

The biggest insight from deploying TITO across multiple teams is that threat model drift is the real enemy. Initial assessments are often thorough and well-reasoned, but agentic AI systems are inherently dynamic. Codebases change constantly. A developer adds a new tool to the agent. Another modifies the system prompt to improve performance. Someone integrates a third-party plugin or enables a new memory backend. Each of these changes subtly reshapes the agent’s capabilities and trust boundaries, often in ways that the original threat model never anticipated. Without continuous enforcement, the threat model quickly becomes outdated — and eventually irrelevant.

TITO addresses this through two complementary mechanisms. The first is CI/CD integration that makes threat modeling automatic and unavoidable. Adding TITO to a GitHub Actions workflow is a single step:

```yaml
uses: Leathal1/TITO@v2
with:
  maestro: true
  mitre: true
  fail-on: critical
```

Every pull request is now scanned as part of the normal development workflow. The resulting threat report is posted directly to the PR as a comment, including a severity breakdown, affected code locations, and concrete remediation guidance. If the scan identifies a critical agentic AI threat, the PR fails. There is no separate review process, no threat model document to update by hand, and no security team bottleneck. The same mechanism that enforces unit tests and linting now enforces security invariants for agentic behavior.
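For context, a complete workflow around that step might look like the sketch below. The TITO inputs are the ones shown above; the trigger and the permissions block are assumptions based on the action checking out the repository and commenting on the PR, so adjust them to match the action's documentation.

```yaml
name: threat-model
on:
  pull_request:

permissions:
  contents: read
  pull-requests: write   # assumed: needed for the action to comment on the PR

jobs:
  tito:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: Leathal1/TITO@v2
        with:
          maestro: true
          mitre: true
          fail-on: critical
```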

The second mechanism is threat diffing, and this is where the CI/CD approach becomes especially powerful. Rather than re-scanning the entire codebase on every PR — which repeatedly surfaces the same known findings and quickly leads to alert fatigue — TITO diffs the threat model between the PR branch and the base branch:

```yaml
uses: Leathal1/TITO@v2
with:
  diff-base: ${{ github.event.pull_request.base.sha }}
  fail-on: high
```

The PR comment now shows only the threats introduced or materially changed by that specific pull request. A developer adding a new tool to the agent sees exactly what new attack surface that tool creates. A developer modifying the system prompt sees whether the change increases prompt injection risk or alters downstream tool behavior. The signal-to-noise ratio is dramatically higher than a full scan, and the feedback arrives at the exact moment when it is cheapest and easiest to act on.

Teams that adopt this workflow consistently report a measurable behavioral shift. When developers see threat diffs on every PR, security considerations move upstream into everyday coding decisions. Developers add validation between MAESTRO layers because they know unvalidated trust boundary crossings will be flagged immediately. They introduce permission checks and capability scoping in tool dispatch because the PR diff makes the violation explicit. Over time, the threat model stops being a static document that lives in a wiki and becomes a living part of the codebase — continuously updated, continuously enforced, and tightly coupled to how agentic AI systems are actually built and maintained.

 

Patterns We're Seeing

After scanning dozens of agentic AI codebases, several patterns have emerged that are worth highlighting for anyone building or securing these systems.

First, every agent with tool access has critical findings. This is not a comment on code quality — it's a structural property of the architecture. If your agent can execute code, write files, or query databases, there are attack paths from user input to those capabilities. The question isn't whether you have threats; it's whether you've identified them, accepted the risk consciously, and implemented appropriate mitigations. TITO makes those threats visible so the conversation can happen.

Second, trust boundary validation between MAESTRO layers is the most common and most dangerous gap we see. In the vast majority of agentic AI codebases we’ve scanned, data flows directly from user input through the LLM and into tool execution with effectively zero validation at any point in the pipeline. User-controlled text becomes model input, model output becomes executable intent, and that intent is acted on by tools that can read files, run commands, or query databases — all without meaningful constraint.

This is not the result of carelessness or poor engineering discipline. It is a natural consequence of how modern agent frameworks are designed. These frameworks optimize for capability, flexibility, and developer experience. They make it trivial to wire an LLM to a growing set of tools and allow the model to decide how and when those tools are invoked. What they do not provide, by default, is a notion of trust boundaries between layers, or primitives for validating, authorizing, and constraining behavior as control passes from one layer to the next.

From a MAESTRO perspective, this collapses multiple layers into a single implicit trust domain. Layer 1 (user input and model interaction), Layer 3 (agent planning and control), and Layer 4 (tool and environment interaction) effectively operate as one continuous execution path. When that happens, a single prompt injection at the outermost layer is enough to influence every downstream capability the agent possesses. There is no opportunity for the system to reassert policy, enforce invariants, or apply defense-in-depth.

The result is that prompt injection is no longer just an input sanitization problem — it becomes a systemic control-flow vulnerability. An attacker does not need to bypass authentication, exploit memory corruption, or chain multiple low-level bugs. They only need to convince the model to behave in a way that the system has already granted it permission to execute. Once the model is compromised, the entire tool chain is compromised by design.

This pattern shows up repeatedly across codebases, regardless of language, framework, or team maturity. Agents with richer toolsets and longer context windows are especially vulnerable, because each additional tool and each additional layer of memory amplifies the impact of a single trust boundary failure. Without explicit validation and authorization between MAESTRO layers, the agent’s autonomy becomes indistinguishable from full system trust — and that is the core security failure MAESTRO is designed to make visible.
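What re-establishing a boundary between the model's output and tool execution can look like, in its simplest form, is sketched below: an allow-list plus per-tool argument validation that fails closed. This is a minimal illustration of the idea, not a complete policy engine and not something prescribed by MAESTRO itself.

```go
package agent

import (
	"context"
	"fmt"
)

// Tool pairs an execution function with the checks the dispatcher runs before
// trusting the model's request to invoke it.
type Tool struct {
	Run      func(ctx context.Context, args map[string]any) (string, error)
	Validate func(args map[string]any) error // reject dangerous arguments: paths, SQL, URLs
}

// Dispatcher only executes tools that are explicitly allowed for this agent
// and whose arguments pass validation. The model's output is treated as
// untrusted input to the tool layer, not as a trusted command.
type Dispatcher struct {
	registry map[string]Tool
	allowed  map[string]bool // capability scoping per agent or per conversation
}

func (d *Dispatcher) Dispatch(ctx context.Context, name string, args map[string]any) (string, error) {
	if !d.allowed[name] {
		return "", fmt.Errorf("tool %q not permitted for this agent", name)
	}
	tool, ok := d.registry[name]
	if !ok {
		return "", fmt.Errorf("unknown tool %q", name)
	}
	if tool.Validate == nil {
		return "", fmt.Errorf("tool %q has no argument validator; refusing to run", name)
	}
	if err := tool.Validate(args); err != nil {
		return "", fmt.Errorf("rejected arguments for %q: %w", name, err)
	}
	return tool.Run(ctx, args)
}
```

The design choice that matters here is failing closed: a tool with no validator or no allow-list entry simply does not run, which forces the boundary to be defined explicitly every time a new capability is added.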

Third, the "helpful assistant" pattern creates the largest attack surfaces. Agents designed to be maximally capable — broad tool access, minimal restrictions, long context windows, persistent memory — are also the most dangerous from a security perspective. This creates a genuine tension between capability and security that MAESTRO helps frame clearly. Layer 3 (Agent Frameworks) threats multiply with every tool you add and with every iteration the agent can take. Understanding this tradeoff explicitly is the first step toward making informed decisions about it.

Fourth, the combinatorial nature of agentic AI threats creates a mitigation challenge that traditional prioritization struggles with. When TITO generated 204 mitigation recommendations for a moderately complex agent, it revealed that agentic AI systems have a fundamentally larger mitigation surface than conventional applications. Every tool multiplies the threat surface. Every data flow between layers creates a new validation point. TITO's priority ranking helps by identifying which mitigations address the most threat instances — in our example, securing the data layer (encryption and secrets management) addressed nearly half of all identified threats, matching MAESTRO's emphasis on data integrity across layers.

 

What Comes Next

TITO is open source and free for static threat analysis. The MAESTRO integration, MITRE ATT&CK mappings, attack path analysis, and 3D visualizations all work out of the box with a single binary — no API keys, no cloud service, no data leaving your machine.

```bash
go install github.com/Leathal1/TITO/cmd/tito@latest
tito scan --repo . --maestro --mitre --attack-paths --3d --output threats.html
```

What we're working toward is making MAESTRO as operationally natural as STRIDE has become for traditional applications. Every agentic AI codebase should have a continuously updated threat model. Every PR should show its threat delta. Every developer building with AI agents should understand — concretely, in terms of their own code — what attack surface they're creating and what mitigations they need.

On the tooling side, the immediate roadmap focuses on closing the gap between static analysis and runtime reality. Threat model drift detection — automatically identifying when a codebase has changed enough that its threat model is stale — is the next step toward continuous assurance. Today, TITO can diff threats between branches. The natural extension is tracking threat model freshness over time: alerting when new agent capabilities are added without corresponding threat analysis, when tool registrations change, or when data flows shift in ways that invalidate prior assumptions.

Extending coverage across MAESTRO's full layer stack is equally important. Static analysis currently goes deep on Layers 1 through 3, where code-level patterns are most visible. Reaching Layers 4 through 7 — deployment infrastructure, observability, compliance, and ecosystem integration — will require pairing static analysis with runtime signals: infrastructure-as-code scanning, observability pipeline analysis, and policy-as-code evaluation.

From the framework side, the CSA is actively evolving MAESTRO based on real-world feedback from implementers — including insights from exactly this kind of tooling integration. As more organizations operationalize the framework, common patterns are emerging around which layer transitions carry the most risk and where the framework needs additional granularity. This feedback loop between framework development and automated tooling is critical: frameworks that exist only in whitepapers have limited impact, but frameworks encoded in scanners and enforced in pipelines become part of how software is built.

There are thousands of AI agents in production today with no threat model of any kind — not because the teams building them don't care, but because the tooling wasn't there when they shipped. MAESTRO provides the conceptual vocabulary. Automated tooling provides the operational path. Together, they make it realistic for every team shipping an AI agent to understand its threat profile as a continuous property of the codebase, not a one-time exercise.

The framework exists. The tooling is catching up. The remaining challenge is adoption, and that requires making threat modeling so frictionless that it's easier to do it than to skip it.

 


Steven Leath is an Application Security Engineer and the creator of TITO. Ken Huang is the creator of the MAESTRO framework at the Cloud Security Alliance.
