The Cyber Archive

Securing organizations ML & LLMops deployments : A platform architects...

Learn to close the real security gaps in AWS Bedrock and Azure AI defaults — IAM, guardrails, private networking, and confused deputy risks in agentic pipelines.


Sai Gunaranjan & Kyler Middleton presenting talk - Securing organizations ML & LLMops deployments : A platform architects journey onboarding LLM &... at fwd:cloudsec North America 2025
Sai Gunaranjan & Kyler Middleton presenting talk - Securing organizations ML & LLMops deployments : A platform architects journey onboarding LLM &... at fwd:cloudsec North America 2025

An internal chatbot trained on your company’s knowledge base just accurately predicted stock purchase timing for a publicly traded company — and your developers didn’t even notice it was ingesting product launch schedules. Securing ML and LLMOps deployments isn’t just a configuration exercise; it’s a race against an attack surface that expands every time an engineer ships a new AI feature without security review.

This post distills a platform architect’s real-world experience hardening AI deployments on AWS Bedrock and Azure AI. You’ll get a concrete checklist of gaps that ship by default, patterns for enforcing guardrails developers can’t simply bypass, and an architecture for keeping machine identities — bots, MCP servers, and agentic pipelines — from becoming a confused deputy liability.

Key Takeaways

  • You'll learn how to identify the critical security gaps in default AWS Bedrock and Azure AI deployments — including missing resource policies, unenforced guardrails, and insecure network defaults — so you can close them before attackers exploit them.
  • You'll be able to design a secure multicloud LLMOps architecture with proper IAM scoping, private networking, knowledge base access controls, and audit logging on both AWS and Azure.
  • Apply the principle of least-privilege to AI machine identities (bots, MCP servers, agentic pipelines) to prevent confused deputy attacks and transitive data over-exposure.

The Expanding AI Attack Surface: Data, Bots, and Machine Identities

AI Systems Are Data Funnels — Scope Your Threat Model First

The foundational challenge in securing ML and LLMOps deployments isn’t a configuration problem — it’s a data access problem. When your organization deploys an AI chatbot or agentic pipeline, that system immediately becomes a high-trust data proxy. As the speakers put it bluntly: “Bots act as a funnel for data. Anything that a bot can access, that user can now transitively access that data. They are going to give it to anyone that asks. They’re very trusting employees.”

This transitive exposure is the core threat model that security engineers must internalize before evaluating any cloud configuration. It doesn’t matter how well your guardrails are tuned or how tight your IAM policies are if you haven’t answered a more fundamental question first: where is your data, and who can your AI systems reach?

The Three Expanding Dimensions of AI Attack Surface

Three interconnected trends are expanding your organization’s AI attack surface simultaneously:

1. Data Propagation Across Clouds

Enterprise data is spread across multiple environments — SaaS tools, proprietary data warehouses like Snowflake[1], internal databases, document stores, and cloud-native storage. AI systems, especially RAG-based knowledge bases, are being trained or connected to all of it. Much of that data hasn’t been classified for AI consumption. If data is sensitive and you feed it to an AI with only a system prompt telling it not to share, “it probably won’t — but is probably good enough for your organization and your data classification.” That’s not a security control. That’s hope.

The practical implication: before a knowledge base ingests anything, your data classification program needs to have caught up with your AI deployment timeline. In most organizations, it hasn’t.

2. Agentic Bots as Machine Identities on Your Network

Machine identities — bots, MCP servers, agentic pipelines — are now acting as users on your network. They authenticate to APIs, read data, process it, and increasingly make changes. The speakers described these as “users that do anything that anyone asks them to do, which is probably a little bit problematic for most data classifications and most APIs.”

This is a fundamental shift in how you model identity risk. Traditional IAM thinking centers on human users and service accounts with bounded, well-defined scopes. An agentic bot with broad data access and no enforcement of least privilege effectively becomes an insider threat vector that operates at machine speed, around the clock, and responds to anyone who queries it.

3. AI Tooling Proliferating Faster Than Security Controls

The Model Context Protocol (MCP)[2] has matured rapidly, enabling agentic bots to interact programmatically with APIs and tooling. The Agent-to-Agent (A2A) protocol has now moved under the Linux Foundation and will likely become a standard implementation target across enterprises within the year. Meanwhile, AI capabilities are being embedded in tools your organization already uses — document readers, IDE plugins, enterprise software platforms — without any security review or deployment process.

The speakers specifically called out Adobe Reader as an example: “It now has generative AI to summarize PDFs. Is that something you need to secure and worry about? Is that something that is going to take advantage of MCP on your network in the future?” These are not hypothetical future risks. They are present-day gaps that most threat models haven’t caught up to.

The Confused Deputy Problem in Agentic Systems

The confused deputy problem — where a privileged automation is leveraged to perform actions beyond its intended scope — is a well-established security concept, but AI introduces new dimensions of risk. The classic example is a Terraform pipeline with broad IAM permissions being exploited to exfiltrate secrets. With agentic bots, the dynamic is more fluid.

MCP initially worked by granting a shared permission set to all users that interacted with a given server — a design the speakers characterized as “terrifying.” The protocol is evolving to support OAuth 2 token forwarding, where the bot prompts the user to prove their identity and proxies that credential downstream. This is an improvement, but it doesn’t eliminate the risk — it shifts it.

As the speakers explained, if a user with administrative privileges interfaces with an MCP server and passes through a token that represents those full administrative rights, the bot now operates with admin-level access. The correct mitigation is privilege pruning at the MCP layer: validating that tokens are legitimate and originated from your network, scoping what rights the token actually confers for the specific data or API being accessed, and ensuring the bot cannot inherit all the rights of the administrative user who triggered it.

AI Engineering Is Data Engineering With a Security Problem

A recurring theme is that AI engineering is, at its core, data engineering. The questions that define good AI security posture are the same ones that define good data governance: Where is your data stored? How is it classified? Who can access it? How is it protected at rest and in transit? How often is it refreshed, and how is it moved across the network?

The difference is the stakes. An AI knowledge base that ingests unclassified sensitive data doesn’t just expose that data to authorized users — it exposes it to anyone who can query the bot, in any form the bot can articulate it. The speakers shared a real-world case where an internal AI assistant trained on company knowledge bases was accurately predicting optimal stock purchase timing for a publicly traded company, because it had ingested product launch schedules and delay information. The bot wasn’t doing anything wrong by its own logic — it was answering questions accurately. The problem was that the data it could reach should never have been fed into it.

That example is the threat model in miniature: your AI systems will give away any data they can access to anyone who asks. Scoping that access correctly — before deployment, not after — is the foundational security control that every subsequent cloud configuration depends on.

Shift-Left Before You Configure

The speakers emphasized that security teams that aren’t integrated early into AI feature development end up in a reactive posture: developers ship an AI feature, security finds out, and the response is “turn it off.” That’s not sustainable as AI deployment velocity increases.

The same shift-left principle that shaped DevSecOps applies here: integrate security review into the AI feature development lifecycle from the start. Classify data before it’s fed to any model. Define the scope of machine identity permissions before a bot is deployed. Review what data sources an agentic pipeline can reach before it goes live. The alternative — hardening after the fact — is dramatically more expensive and leaves real exposure windows open.

Actionable Takeaways

  • Inventory every data source your AI systems can reach — knowledge bases, document stores, databases, third-party integrations — and classify each for AI suitability before any ingestion occurs. Assume any data a bot can access will be disclosed to any user who queries it.
  • Treat machine identities (bots, MCP servers, agentic pipelines) as first-class IAM principals. Apply least-privilege scoping to every bot identity, and implement OAuth 2 token validation with privilege pruning at the MCP layer to prevent confused deputy exploitation.
  • Integrate security review into AI feature development from the start — not after deployment. Define access scopes, data classification requirements, and guardrail policies before an AI feature ships, the same way you would for any externally-facing API.

Common Pitfalls

  • Feeding unclassified or over-permissioned data into AI knowledge bases and relying on system prompt instructions ("don't share this") as the primary data protection control. System prompts are not access controls — they are suggestions that AI systems may not reliably honor, especially under adversarial prompting.
  • Treating MCP token forwarding as a complete solution to the confused deputy problem without implementing privilege pruning. If an admin user triggers an agentic bot and passes through a token with full administrative rights, the bot inherits those rights. Token forwarding addresses identity proof, not scope reduction.

AWS Bedrock Security Architecture: IAM, Guardrails, and Logging Gaps

Securing ML and LLMOps deployments on AWS starts with understanding what Bedrock[3] actually is — and what it is not. Bedrock operates at scopes three and four: it gives you access to pre-trained and fine-tuned foundational models, including serverless pay-per-token inference. You are not training your own model here. If you need to pre-train or run custom models, that is SageMaker[4] territory — and at roughly $30,000 per month for full model training, most teams will stick with Bedrock for experimentation and production chatbot use cases.

The Bedrock RAG Pipeline Architecture

AWS Bedrock RAG pipeline security architecture diagram

The most common production pattern for AWS Bedrock security architecture combines several managed services into an asynchronous pipeline. Understanding this architecture is essential before you can reason about where controls apply and where they do not.

How the pipeline works:

  1. A system input (chatbot, API, or application) sends a user request.
  2. A Lambda receiver[5] immediately returns HTTP 200 to satisfy webhook timeouts — Bedrock responses take longer than most webhook SLAs allow.
  3. The receiver asynchronously invokes a Lambda worker (the real processor).
  4. The worker queries the knowledge base (backed by OpenSearch or Aurora with pgvector[6]), requesting 50–75 semantic chunks related to the user query.
  5. A re-ranker model reads the user request and all retrieved chunks, winnowing them down to the most relevant subset. This step takes roughly 250ms and dramatically improves response quality with minimal cost — always enable re-ranking.
  6. The refined context, system prompt, and user request are passed through Bedrock guardrails, which police tokens both inbound and outbound.
  7. A foundational model generates the final response. If the response passes guardrails, it is returned to the user.

This entire pipeline — knowledge base retrieval, re-ranking, guardrail filtering, and foundational model inference — completes in roughly 3–6 seconds. At roughly one dollar per day for ~150 requests, it is an extremely cost-effective way to run a production internal chatbot.

Foundational Models and the Converse API

One frequently overlooked constraint: foundational models accessed directly via the Bedrock API do not support document uploads (PDFs, Excel, PowerPoint, images). The documentation for models like Claude Sonnet mentions document support, but that support applies only when calling the model’s native API directly — not through Bedrock.

The solution is the AWS Bedrock Converse API[7], a meta-API that normalizes requests across all foundational models and adds document ingestion support. It abstracts away per-model API differences, making it trivial to swap foundational models without changing application code. For any deployment where users may submit free-form documents, the Converse API is the correct integration point.

IAM: The Only Access Control You Have (For Now)

Principal IAM policies are validated on every Bedrock API call — this part works as expected. However, the security model breaks down at the resource level.

The critical gap: there are no resource policies for AWS Bedrock. Unlike S3 buckets, Bedrock knowledge bases, models, and agents are not real AWS resources in the policy sense. You cannot attach a resource policy to a knowledge base to restrict which principals can query it. If your knowledge base contains your company’s most sensitive data, there is currently no way to lock it down at the resource level.

The only available control is to audit every IAM principal in the account and ensure none of them have Bedrock permissions they should not have. This is the inverse of least-privilege — instead of granting access to the resource, you must explicitly deny or omit Bedrock permissions from every principal that should not have it. AWS has indicated that resource policies are on the roadmap, but they are not available today.

Additionally, AWS Config cannot be used to audit Bedrock. Because Bedrock resources are not real AWS resources, there are no Config rules available to enforce or detect misconfigurations. Your compliance tooling will have a blind spot here.

Guardrails: Powerful but Bypassable

Bedrock guardrails provide content filtering in both directions — they can block harmful categories (hate speech, sexual content, violence), enforce topic restrictions, and prevent sensitive data from being returned in responses. The insider trading example is instructive: a knowledge base that ingested product launch schedules was accurately predicting stock purchase timing before topic-based guardrails were applied.

Guardrails are specified per-API-call. When a caller includes the guardrailIdentifier and guardrailVersion parameters, the guardrail is applied. When a caller omits these parameters, guardrails are simply not applied. There is no way to enforce guardrail usage at the account or knowledge base level.

This means your developers can — intentionally or through oversight — bypass all of your content filtering simply by omitting two parameters from their API calls. Unlike Azure (where a guardrail equivalent is attached to the model deployment itself and cannot be bypassed), Bedrock’s guardrail enforcement is entirely client-side.

Practical implication: Security teams cannot rely on guardrail adoption without developer coordination. Platform-level enforcement is not available. You must build guardrail usage into your internal SDKs, infrastructure-as-code templates, or API wrappers — and validate compliance through CloudTrail log analysis.

Insider Trading Risk: Knowledge Base Ingesting Product Launch Schedules

Proof of Concept

  1. Unclassified data ingested into knowledge base: The platform team configured an AWS Bedrock knowledge base to ingest broad internal documentation — including product launch dates, delay notifications, and internal communications — without first classifying which data was sensitive or material non-public information (MNPI). The ingestion pipeline used a foundational model to parse and embed unstructured documents (Word docs, PDFs, schedules) into a vector database (OpenSearch or Aurora via pgvector).

  2. User queries trigger RAG retrieval: An internal chatbot powered by the Bedrock RAG pipeline received a natural-language query from a user asking when they should purchase stock in the publicly traded company. The Lambda worker handling the request queried the knowledge base, which returned a large set of relevant vector chunks — including product milestone dates, delay notices, and launch schedule updates — ranked and filtered by a re-ranker model.

  3. Foundational model synthesizes MNPI into actionable guidance: The filtered chunks were passed along with the user query to a Bedrock foundational model (e.g., Claude Sonnet via the Converse API). Because the knowledge base contained accurate, up-to-date product launch information, the model was able to synthesize a response that was “scary accurate” about when to buy or sell stock — effectively functioning as an insider trading advisory tool.

  4. No guardrail was enforced at the API layer: The Bedrock guardrail system, which can block specific topics and filter output tokens, was not enforced on this particular API call. In AWS Bedrock, guardrails are specified as optional parameters on each API call — if a developer omits the guardrail parameter, the guardrail is simply not applied. The organization discovered the gap only after the model had already produced the problematic response.

  5. Remediation — topic guardrail added: Once the risk was identified, the team added an explicit guardrail topic blocking any discussion of stock purchase timing, investment decisions, or content that could constitute insider trading advice. This guardrail was configured in the Bedrock console using the AI-for-your-AI topic filtering capability, which uses a secondary model to evaluate whether a response falls into a prohibited subject area before returning it to the user.

  6. Root cause — absence of data classification before ingestion: The underlying issue was that no data classification review was performed prior to feeding internal documentation into the knowledge base. Material non-public information (product launch dates, delay schedules) was treated the same as general internal knowledge. The fix required both the guardrail addition and a retroactive review of what data the knowledge base was permitted to ingest — establishing a classification gate before any new data source is connected to the RAG pipeline.

  7. Key lesson — bots are indiscriminate data funnels: Bots act as funnels for any data they can access. Any user interacting with the chatbot could transitively access any information the bot had ingested — regardless of whether that user had direct access to the underlying source document. Sensitive data fed to a bot is, in practice, accessible to anyone who can talk to the bot.

AWS Bedrock Guardrail Bypass: Developers Omitting the Guardrail Parameter on API Calls

Proof of Concept

  1. Understand the enforcement model difference: In Azure AI, when a model is deployed to an AI hub or ML workspace, the content filter configuration is attached to the deployment itself. Any application that calls that deployment inherits the content filter — it cannot be disabled per request. In AWS Bedrock, there is no equivalent deployment-level binding. Guardrails exist as separate resources with their own IDs, but they must be explicitly referenced on each individual API call.

  2. Identify the bypassable API parameter: When a developer calls the Bedrock InvokeModel or Converse API, the guardrail is specified as an optional field (e.g., guardrailIdentifier and guardrailVersion on the Converse API). If a developer omits these parameters — whether intentionally, by accident, or because they copied a code snippet that predated the guardrail deployment — the call proceeds without any guardrail evaluation. The Bedrock platform does not reject or flag the unauthenticated call.

  3. Reproduce the bypass: A developer building a new chatbot feature references an older code example or SDK snippet that does not include guardrail parameters. They call the Converse API with a valid IAM principal, a valid foundational model ID, and a user prompt — but no guardrailIdentifier. The call succeeds. The foundational model returns a response. No guardrail topic denial, harmful content filter, or sensitive data redaction is applied. The response reaches the user unfiltered.

  4. Contrast with a guardrail-enforced call: When the same developer includes the correct guardrailIdentifier and guardrailVersion fields referencing the organization’s configured guardrail resource, Bedrock evaluates the input tokens against the configured denied topics, harmful category filters, and sensitive data patterns before the foundational model ever processes the request. The difference in security posture between these two calls is total — and both are syntactically valid.

  5. Observe the lack of platform-level recourse: The speakers confirmed there is no AWS Config rule available for Bedrock resources because Bedrock does not expose config-auditable resources in the traditional sense. CloudTrail[8] logs will record the API call and the absence of a guardrail parameter — but only if someone is actively analyzing those logs for missing fields. There is no native mechanism to deny or block API calls that lack a guardrail reference. The only available control is principal IAM policy, which restricts who can call Bedrock at all — not whether they must use a guardrail when they do.

  6. Organizational impact: Because the guardrail is optional and invisible to the end user, a developer bypass produces no visible error, no alert, and no user-facing indication that security controls were skipped. A single developer omitting the guardrail parameter on a new code path can reopen that exact exposure without realizing it.

  7. Recommended mitigations given current AWS constraints:

    • Audit CloudTrail logs for Bedrock Converse API and InvokeModel calls missing the guardrailIdentifier field. Build a CloudWatch[9] metric filter or SIEM rule to alert on these.
    • Enforce guardrail parameter inclusion through internal SDK wrappers or API gateway middleware that intercepts Bedrock calls, injects the correct guardrail reference, and prevents calls without it from reaching Bedrock directly.
    • Implement a shift-left control: integrate guardrail parameter presence as a code review checklist item and a static analysis rule in CI/CD pipelines for any service that calls Bedrock.
    • Monitor the AWS roadmap for Bedrock resource policies, which would allow attaching IAM conditions to knowledge bases and model invocations — the eventual platform-native fix for this gap.

Audit Logging: CloudTrail and CloudWatch

CloudTrail is your primary audit mechanism for Bedrock. It logs who called which API verb against which Bedrock resource — enabling detection of unauthorized model usage, unexpected cost spikes, and anomalous access patterns. You should have automated tooling analyzing CloudTrail continuously: if a principal suddenly begins hammering serverless Bedrock endpoints and generating significant token charges, you need to catch it quickly, because there are no resource-level rate limits or access controls to stop it.

CloudWatch provides dashboards, usage metrics, and conversation recording. You can build dashboards tracking call volume by model and over time. Conversation recording is also available — but with a significant limitation: recording is enabled per region and account, not per deployment or agent. When you enable conversation logging, all AI requests across all models in that region flow into a single log group. There is no native way to segregate logs by chatbot application, agent, or use case. This makes forensic investigation and per-deployment auditing substantially more difficult than the Azure equivalent.

Known Gaps Summary

Gap Status Workaround
No Bedrock resource policies Coming (per AWS roadmap) Audit and restrict principal IAM policies
No AWS Config rules for Bedrock No roadmap Manual auditing only
Guardrails are optional per API call No fix announced Enforce via internal SDKs and wrapper libraries
Conversation recording not scoped per deployment No fix announced Filter CloudWatch logs by model ARN or timestamp
Knowledge base web crawler lacks authentication Beta Reverse proxy (not recommended for production)
Knowledge base does not support Aurora as backend No roadmap Use OpenSearch (~40x more expensive)

Vector Database Cost Considerations

The knowledge base requires a vector database backend. OpenSearch is the most commonly used option but costs approximately $40 per day. Aurora with pgvector costs under $1 per day — roughly 40 times cheaper. Aurora is not yet supported as a native Bedrock knowledge base data source, so teams that want the cost advantage must build a custom retrieval layer. This is worth planning for if you are scaling beyond experimentation.

Actionable Takeaways

  • Treat CloudTrail as your only real-time guardrail against Bedrock misuse: build automated alerting on unexpected API callers, sudden spikes in token usage, and any invocation that lacks a guardrail ID in the request parameters. This is the closest thing to resource-level access control available today.
  • Enforce guardrail usage at the SDK or infrastructure layer, not at the Bedrock resource level. Build internal wrapper libraries or IaC templates that always include the guardrail parameters, and validate compliance by parsing CloudTrail logs for guardrail-less API calls. Do not rely on developer discipline alone.
  • Classify all data before feeding it to Bedrock knowledge bases. Since resource policies do not exist, the only way to limit exposure is to ensure sensitive data (insider information, PII, regulated records) is excluded from ingestion pipelines at the source — before it reaches the knowledge base. The insider trading case study demonstrates the real-world consequences of skipping this step.

Common Pitfalls

  • Assuming Bedrock guardrails are enforced by the platform. They are not. A developer who omits the guardrail parameters in an API call bypasses all content filtering silently — no error, no warning. Without wrapper enforcement, guardrail adoption is voluntary and unverifiable.
  • Relying on CloudWatch conversation logging for per-deployment forensics. Because logging is region-wide rather than scoped to a specific agent or chatbot, all conversations across all models land in one log group. Without additional metadata tagging in your application layer, attributing a specific conversation to a specific deployment or user is extremely difficult.

Azure AI Security Controls: Network Isolation, Access Patterns, and Policy Enforcement

Azure AI Security Architecture: Starting with Shared Responsibility

Before configuring a single resource, security engineers need to internalize Microsoft’s shared responsibility model for Azure AI services. Just as it was for cloud infrastructure a decade ago, the boundary between platform-managed and customer-managed security is non-trivial — and AI workloads don’t simplify that boundary, they extend it. As a consumer of Azure AI, your team owns responsibility for how models are hosted, how data flows, and how access is governed. The platform does not do this for you.

The core insight: Azure defaults are cost-optimized, not security-optimized. Every control described below requires an explicit opt-in. App dev teams provisioning Azure AI Foundry[10], Azure ML Workspace, or Azure AI Hub via Terraform, Bicep, ARM templates, or even click-ops will skip these settings unless your team has enforced them at the policy layer.

Layer 1: Network Deployment — Managed Networks and Private Endpoints

The first line of defense for Azure AI security controls is network isolation. When deploying Azure ML Workspace or Azure AI Hub, you must explicitly select a Microsoft-managed network configuration. This is not the default. Without it, your ingress and egress controls are unenforced — any traffic can flow in or out of the service boundary without governance.

Once managed networking is enabled, two key configurations become possible:

  • Ingress/egress governance: You can define exactly which services the AI workspace can communicate with — whether that’s the Azure portal, partner APIs, or internet endpoints. Granular firewall rules (visible in the AI Hub network settings) let you allow specific destinations like Python package registries while blocking everything else.

  • Private endpoints for support services: Azure AI deployments depend on a set of support services — Azure Container Registry (ACR), Azure Storage Accounts, Azure Key Vault[11], and databases. By default, none of these are private. Every one of them needs a private endpoint or private link service endpoint configured so that only the ML workspace and authorized users can reach them. Without this, anyone with network access to the environment can potentially reach your storage or key vault.

At the edge of your application stack, Azure Front Door[12] or another WAF (Web Application Firewall) solution should protect your AI API endpoints from external threats before requests reach the AI services layer. Combined with Azure Firewall for outbound traffic governance, this creates a two-layer network boundary — one at ingress, one at egress — that prevents your models from making unauthorized outbound calls.

Layer 2: Data Access Patterns — Private Connections and Entra-Based Authentication

Once the AI workspace itself is isolated, the next challenge is getting data to it securely. Models need training data, fine-tuning datasets, and knowledge base content — and that data typically lives in SQL databases, storage accounts, or partner platforms like Snowflake.

The correct pattern here is to establish private connections from Azure ML or Azure AI Hub to each data source:

  • Cross-tenant, cross-subscription, or partner service connections should use Azure Private Link or service endpoint connections so that data never traverses the public internet.
  • Each connection should be independently revocable — if something goes wrong with a data source or a pipeline is compromised, you want the ability to break that specific connection without tearing down the entire workspace.

For authentication to these data sources, Microsoft Entra-based identity (formerly Azure AD) is the right model. The ML workspace itself has a managed identity — enable it, then use that identity to authenticate against SQL databases, storage accounts, and Snowflake where supported. This eliminates the need for static credentials (keys, connection strings hardcoded into deployment templates or notebooks) that create long-lived, unrevocable access paths and violate least-privilege principles.

For data scientists and researchers who need to access the workspace directly — to run workbooks, configure deployments, or test models — the recommended access pattern is VPN-based access into a private tunnel to the AI workspace. Direct public internet access to the workspace should not be available.

Layer 3: Azure Monitor — Logging Every Action

Logging in Azure AI is handled through Azure Monitor[13], and it requires explicit configuration. The recommended setup includes:

  • Diagnostic settings on the ML workspace and all connected services
  • Activity logs capturing control-plane operations (who deployed what, when)
  • Insights for service-level telemetry

This matters operationally, not just for compliance. If an ML workspace is querying an ACR registry or pulling data from a storage account in unexpected ways, Azure Monitor is where you find that. It also provides the audit trail needed to prove (or disprove) that access controls are working as intended.

This is one area where Azure’s approach is more operationally useful than AWS Bedrock’s CloudWatch conversation recording: Azure Monitor’s diagnostic settings are per-resource, giving you targeted visibility into each workspace’s behavior rather than a region-wide log group that commingles traffic from all AI deployments.

Data Residency Risk: Global Standard vs. Standard Model Deployments

One of the less-obvious risks in Azure AI security controls is data residency. When deploying a model from the Azure Marketplace, you’ll encounter deployment options including “Global Standard” and “Standard” (region-specific). Microsoft’s own documentation explicitly warns that when using Global Standard, data may be processed outside your selected geography.

For organizations handling sensitive data — healthcare, finance, or any regulated industry — this is a compliance issue, not just a preference. If you’re operating under HIPAA, PCI-DSS, GDPR, or similar frameworks, you must select region-specific deployment options and verify that your data never leaves the designated geography. This check is easy to overlook when app dev teams are moving fast and selecting defaults.

Layer 4: Azure Policy — Blocking Insecure Deployments at Scale

The most operationally durable control in the Azure AI security stack is Azure Policy[14]. Unlike documentation, training, or manual review, policies enforce compliance programmatically — blocking non-compliant deployments before they reach production.

Key policy use cases for Azure AI environments include:

  • Blocking insecure model deployments: Policies can deny deployment of any model that doesn’t meet your network isolation or authentication requirements.
  • Restricting marketplace model access: If your organization has a curated set of approved models, policies can block any attempt to deploy untrusted or unapproved marketplace models. This is critical for supply chain security in AI workloads.
  • Preventing non-private network configurations: Policies can enforce that all new workspace deployments must use managed networks and private endpoints, eliminating the risk of a developer clicking through defaults.

Some of these policies are still in preview from Microsoft; others can be customized for your specific tenant requirements. The combination of Microsoft-published policies and organization-specific custom policies creates a policy-as-code layer that scales across teams without relying on individual engineers to remember security requirements.

Content Filtering: The Azure Guardrail Advantage

Within the Azure AI workspace itself, content filters can be configured to define what the model accepts as input and what it returns as output. This is functionally similar to AWS Bedrock guardrails, but with one critical operational difference that directly addresses LLMOps security.

In Azure, when you deploy a model into a deployment, the content filter is specified at the deployment level — it’s hardcoded into that deployment configuration. If an application calls that deployment, it gets those content filters, full stop. Developers cannot bypass the guardrail by simply omitting a parameter on their API call.

This is a meaningful architectural advantage over AWS Bedrock’s current model, where guardrails are an optional parameter on each API call. An Azure deployment-level content filter cannot be silently bypassed by a developer who forgets (or chooses not) to include the guardrail ID in their request.

A custom blocklist feature (currently in preview) extends this further, allowing organizations to define specific terms, topics, or content patterns that the model must never produce — useful for regulated industries where certain types of output (medical advice, insider information, PII) must be blocked regardless of how the prompt is constructed.

Actionable Takeaways

  • When provisioning any Azure AI resource (ML Workspace, AI Hub, AI Foundry), immediately select Microsoft-managed network mode and configure private endpoints for all support services (ACR, Storage, Key Vault). Never accept the default network configuration — defaults are cost-optimized, not security-optimized. Verify this is enforced by Azure Policy so it applies to all future deployments, not just current ones.
  • Eliminate static credentials from all Azure AI data access patterns. Enable the ML workspace's managed identity and configure Microsoft Entra-based authentication against every connected data source — SQL databases, storage accounts, Snowflake, and other partner services. Audit existing deployments for hardcoded keys or connection strings in Terraform templates, Bicep files, and workbook configurations.
  • Implement Azure Policies to block non-compliant AI deployments before they reach production: deny any model deployment that bypasses private networking, block unapproved marketplace models, and enforce content filter requirements at the deployment level. Pair this with Azure Monitor diagnostic settings on every workspace to maintain a full audit trail of control-plane and data-plane activity.

Common Pitfalls

  • Selecting "Global Standard" deployment for marketplace models without realizing this allows data to be processed outside your designated geography. For regulated industries (healthcare, finance), this is a compliance violation — always verify and select region-specific deployment options, and codify this check in your Azure Policy set.
  • Assuming documentation and training will keep app dev teams from shipping insecure Azure AI deployments. The defaults are insecure, and developers under time pressure will select them. Only Azure Policy enforcement — not documentation — reliably blocks insecure configurations at scale. Teams that rely on process controls alone will inevitably have a misconfigured workspace reach production.

MCP, Agentic Pipelines, and the Confused Deputy Problem

The Confused Deputy Problem in Agentic AI Systems

One of the fastest-growing threat surfaces in modern AI deployments is not the model itself — it is the machine identity that runs it. As organizations adopt securing ML and LLMOps deployments practices, agentic bots operating via the Model Context Protocol introduce a class of vulnerability that security engineers have seen before: the confused deputy problem.

The confused deputy problem occurs when a privileged intermediary — a bot, pipeline, or automation — is manipulated into using its permissions on behalf of an unauthorized requester. A common non-AI example is a Terraform pipeline in GitHub: because it holds broad IAM permissions to create, read, update, and delete infrastructure, any actor who can influence that pipeline can potentially leverage those permissions to exfiltrate secrets or modify resources they were never authorized to touch. The same dynamic now applies to every MCP server and agentic bot connected to your environment.

How MCP Originally Introduced Confused Deputy Risk

In its early form, MCP operated with a static, shared permission set. When an MCP server was granted access to an API or data source, every user who interacted with it inherited that full permission set — regardless of their actual access rights. From a security architecture standpoint, this is equivalent to granting every caller the IAM role of the automation itself. The result: an attacker (or even an overly curious internal user) could access data through the MCP layer that they would never be permitted to reach directly.

This was not a theoretical gap. As one of the speakers noted: “How MCP used to work is you gave it a permission and everyone that talked to it got that permission — and that’s, you know, terrifying.”

OAuth 2 Token Forwarding: A Mitigation That Creates New Risk

The MCP project has matured rapidly, and the current direction addresses the static-permission model by implementing OAuth 2 token forwarding. Under this model, when a user or bot interacts with an MCP server, it does not simply inherit the server’s permissions. Instead, it must present an OAuth 2 token that proves its own identity and permission scope. The MCP server then proxies the connection using that token — meaning the downstream API or data source sees the caller’s actual claims, not the server’s.

This is meaningful progress. If a user cannot access a particular dataset directly, presenting their own token through the MCP layer will not grant access to it. The server cannot be used as a confused deputy to escalate the caller’s privileges beyond what the token authorizes.

However, OAuth 2 token forwarding does not eliminate the risk — it shifts it. The critical failure mode is administrators and power users. If a user with broad administrative rights interacts with an MCP server and provides their own token, that bot now operates with administrative credentials. The speaker described this directly: “I have administrative rights to a lot of systems. And if I interface with a bot and it says, ‘Give me a token that proves I’m you’ and I get all my rights — well, now this bot has administrative rights, and that’s probably a bad idea.”

Confused Deputy Attack via MCP Token Forwarding with Admin Credentials

Proof of Concept

  1. Baseline — the original confused deputy problem: The MCP server is configured with a single static permission set (e.g., an IAM role or service account) and every user who sends a request to the bot operates under those same permissions. An attacker who can interact with the bot inherits whatever the bot’s machine identity can do — including accessing sensitive knowledge bases, calling expensive foundational models, or reading data across the environment. No user-level permission check is enforced.

  2. Evolution — OAuth 2 token forwarding introduced: The MCP server is updated to implement token forwarding. When a user (or another agent) initiates a session with the MCP server, the server challenges the caller to present an OAuth 2 token proving their identity and permissions. The MCP server then proxies that token forward to downstream APIs and services, so the data or actions returned are scoped to what that caller is actually authorized to do. A low-privilege user cannot access data they couldn’t reach directly.

  3. Attack scenario — admin user triggers agentic pipeline: A platform engineer or cloud admin (who holds administrative rights to the organization’s AI infrastructure, knowledge bases, or cloud APIs) interacts with an MCP-backed agentic bot. The MCP server prompts them: “Provide a token proving your identity.” The admin authenticates and their OAuth 2 token — carrying full admin claims — is forwarded into the pipeline.

  4. Privilege escalation in the pipeline: The MCP server, or the downstream agent/tool it invokes, now operates with the admin’s token for the duration of that session. Any subsequent actions taken by the agentic pipeline — querying knowledge bases, calling APIs, modifying configurations, accessing secrets — execute under the admin’s authorization scope. The pipeline can perform actions the admin could do directly, even if those actions were never intended to be delegated to an automated process.

  5. Why this is a confused deputy outcome: The MCP server is the “deputy” — an intermediary that holds or forwards a privileged credential on behalf of the initiating principal. Because the token carries the full permission set of the admin rather than a scoped-down subset, the bot effectively acts as an admin for that session. Any prompt injection, malicious tool output, or logic flaw in the pipeline during this window can be exploited to perform privileged operations.

  6. Missing control — privilege pruning at the MCP layer: The correct mitigation requires rights pruning on the forwarded token: the MCP server (or the OAuth 2 token issuance process) should reduce the scope of the token before forwarding it downstream, granting only the minimum permissions required for the specific task the pipeline is executing. This is either an MCP-layer function or an OAuth 2 token issuance control, but neither is yet standardized or widely implemented.

  7. Additional validation gaps: Beyond scope reduction, remote MCP servers must also: validate that tokens are genuine and originated from the organization’s own identity provider (not forged external tokens), verify that the token holder actually has rights to the specific data or API being accessed, and confirm that the caller is an internal principal rather than an external or unauthenticated entity. Without these controls, the token-forwarding model replaces one confused deputy risk with another.

Privilege Pruning: The Missing Layer

MCP confused deputy attack and privilege pruning diagram

OAuth 2 token forwarding alone is insufficient for agentic pipelines that may be triggered by privileged users. Security engineers building or auditing these systems need to implement privilege pruning at the MCP layer — that is, deliberately constraining what rights a token may exercise when passed through an agentic pipeline, regardless of what that token technically permits.

The principle is the same as least-privilege IAM scoping for service accounts: even if a human user legitimately holds broad permissions, the bot acting on their behalf should receive only the minimum scope needed for the specific task at hand.

Concretely, MCP servers require three security controls to operate safely:

  1. Token validation: Verify that presented tokens are genuine, originated from your network, and carry valid claims before acting on them. Tokens from unverified sources should be rejected outright.
  2. Scope verification: Confirm that the token actually authorizes the specific data or API access being requested — do not rely solely on token authenticity.
  3. Privilege pruning: Enforce a maximum permission ceiling for bot-executed actions that is lower than the caller’s full permission set, particularly for any action that is irreversible (writes, deletes, data exports).

Agent-to-Agent (A2A) Protocols and Compounding Risk

Beyond individual MCP servers, the agent-to-agent (A2A) protocol — now under the Linux Foundation — introduces a second layer of confused deputy risk. When one agentic bot triggers another, the permission and identity claims propagate through the chain. If the originating agent was granted elevated permissions, those may compound through each hop unless each agent in the pipeline independently enforces privilege pruning.

This is a compounding risk that security teams have limited visibility into today. Most organizations cannot currently enumerate all the agentic pipelines running in their environments, let alone audit the identity propagation across them. The A2A protocol’s rapid adoption trajectory means this will require dedicated identity governance tooling within the next year.

Actionable Takeaways

  • Audit every MCP server in your environment for its effective permission model: does it use static shared credentials, or has it been upgraded to OAuth 2 token forwarding? For any server still using static credentials, treat it as a confused deputy risk and either migrate it or restrict the accounts it has access to immediately.
  • Implement privilege pruning as a hard architectural requirement for any new MCP server or agentic pipeline: define a maximum permission ceiling for bot-executed actions that is lower than the triggering user's full permission set, and enforce this at token issuance rather than relying on downstream APIs to reject over-privileged requests.
  • Establish machine identity governance for your agentic bots: enumerate all bot/service accounts, document their permission scopes, integrate them into your IAM review process, and ensure CloudTrail (AWS) or Azure Monitor logging captures actions taken by these identities so that anomalous behavior — a bot suddenly making bulk data reads or cross-account calls — is detectable.

Common Pitfalls

  • Assuming OAuth 2 token forwarding fully solves the confused deputy problem. It prevents privilege escalation for low-privilege callers, but it exposes a different risk when high-privilege users (administrators, power users) interact with MCP servers — their tokens grant the bot their full administrative rights. Without explicit privilege pruning, OAuth 2 token forwarding is necessary but not sufficient.
  • Treating agentic bots as system processes rather than as user-class identities. Because bots are automated and not associated with a named human, they often escape the IAM review cycles applied to human accounts — accumulating broad permissions over time through developer convenience rather than through deliberate access grants. Every bot should go through the same least-privilege scoping review as a service account.

Shift-Left Security for AI: Governance, Classification, and Developer Enablement

The Default Platform Engineering Posture Is Reactive — and It Doesn’t Scale

If your platform engineering team is not involved until after an AI feature ships, your daily workflow looks like this: a developer adds a new AI capability, you find out, and you tell them to turn it off. That’s not securing ML and LLMOps deployments — that’s playing whack-a-mole at scale. The architecture talks and access control configurations in the preceding sections only matter if security is embedded into the AI development lifecycle before the feature goes live.

The presenters are explicit: shift-left is not optional for AI. Just as DevSecOps moved security earlier in the software delivery lifecycle, AI security must follow the same pattern. If you’re not integrated as far left as you can be in AI feature development, every shipped feature is a potential data governance gap waiting to be discovered.

Why AI Features Are a Special Shift-Left Challenge

AI features are categorically different from conventional software features in one critical way: they act as data funnels. Any data an AI system can access, it will give to any user who asks — politely, fluently, and convincingly. There is no SQL query that limits what the model will reveal. There is no access control at the inference layer unless you explicitly build one.

This means the classification question — is this data appropriate for AI ingestion? — must be answered before the knowledge base is built, before the fine-tuning dataset is assembled, and before the RAG pipeline is connected to a data source. Answering it after deployment is, at best, remediation. At worst, it’s regulatory liability.

The presenters note that most organizations are treating AI engineering as a kind of data engineering problem. That framing is useful: you need to know where your data is, how it’s classified, who can access it, and how it should move across the network — the same disciplines that underpin data warehouse security. The difference is that the AI layer removes the structured query abstraction. The model will synthesize, infer, and surface information in ways that no schema or row-level security policy anticipates.

The Insider Trading Case Study: A Data Classification Failure in Production

The most concrete illustration of this risk comes directly from the presenters’ own organization. An internal chatbot was built on a knowledge base that ingested company data broadly — product launch schedules, internal timelines, delay notifications. No malicious actor was involved. No attacker exploited a vulnerability. The system worked exactly as designed.

The problem: when users asked the chatbot when they should purchase stock in the publicly traded company, it answered accurately. It had ingested enough forward-looking internal information — product launch dates, delays, publication schedules — that it could effectively synthesize material non-public information on demand. That’s insider trading, regardless of whether the intent was to enable it.

The fix was implementing a Bedrock guardrail topic block for stock purchase guidance. That works as a compensating control. But the root cause was not the absence of a guardrail — it was the absence of data classification before ingestion. The question that should have been asked before any data was fed into the knowledge base was: does this information constitute material non-public information? If yes, it should not have been ingested at all, or should have been scoped to a separate, access-controlled deployment.

This case study defines the organizational process gap: data governance must precede AI system design, not follow it.

Enforcing Guardrails Developers Cannot Bypass

Even when guardrails are correctly configured, the enforcement model matters. On AWS Bedrock, guardrails are passed as an optional parameter on each API call. Developers can simply omit the parameter. The guardrail exists, is correctly configured, and is completely irrelevant to any request that doesn’t include it. There is no mechanism to mandate its use at the platform level.

This is a structural difference from Azure’s content filter model, where content filters are attached to the model deployment itself. If the deployment has a content filter configured, every request through that deployment is subject to it — no parameter required, no developer opt-out possible.

For organizations running on AWS Bedrock today, the compensating controls are:

  • IAM-enforced API access: limit which principals can call Bedrock APIs and monitor via CloudTrail for calls that lack guardrail parameters
  • Application-layer enforcement: build the guardrail parameter into your Lambda worker or application code, and treat its omission as a pipeline error
  • Code review gates: include guardrail parameter presence as a mandatory check in PR review for any code touching Bedrock APIs

The broader organizational lesson: security controls that developers can opt out of are not security controls — they are suggestions. Platform security teams need to evaluate, for every AI security mechanism, whether it can be bypassed by an application developer who either doesn’t know or doesn’t care. If it can, the control must be wrapped in something that removes that optionality.

Shift-Left Integration: What It Looks Like in Practice

Shift-left for AI security means inserting security review at the point where AI features are conceived, not where they ship. In practice, this requires:

1. Data classification as a gate on AI feature approval. Before any AI feature can proceed to implementation, the data it will ingest or access must be classified. The relevant questions: Is this PII? Is this regulated data (PHI, financial data)? Is this material non-public information? Does the classification permit it to be surfaced to all users who can access the AI endpoint, or only to a subset? This review should happen at the feature design stage, not after the knowledge base is built.

2. Access scoping as a design requirement, not an afterthought. AI systems should be provisioned with the minimum data access required for their intended use case. If the chatbot is meant to answer HR policy questions, it should have access to the HR policy knowledge base — not the payroll database, not the product launch calendar, not the financial reporting store. Every data source added to an AI system’s access scope should require explicit justification and review.

3. Security team integration into AI feature development sprints. The platform engineering team should be a standing participant in AI feature development — not a gatekeeper consulted at the end, but a collaborator involved at the design stage. This is the same model that DevSecOps applied to infrastructure and deployment pipelines. AI features are not categorically different in terms of the organizational change required.

4. Developer enablement through secure defaults. Rather than relying on developers to correctly configure guardrails, content filters, network isolation, and access controls, platform teams should provide pre-hardened deployment templates, Terraform modules, or infrastructure-as-code blueprints that encode secure defaults. The Azure Policy approach — blocking non-compliant deployments at the platform level — is the most robust implementation: developers cannot accidentally skip security controls if the deployment fails without them.

The Cost of Not Shifting Left

The presenters close with a direct observation: if you are not aware that your developers are building AI services in your organization today, you need to become aware of it — because it is happening. AI tooling is cheap, accessible, and integrates easily with existing cloud infrastructure. The barrier to shipping an AI feature is lower than it has ever been for application development teams.

That accessibility is what makes the shift-left imperative urgent. Every AI feature that ships without security review is a potential unclassified data exposure, a potential guardrail bypass, a potential confused deputy liability. The reactive model — discover, audit, shut down — does not scale to the pace at which AI capabilities are being embedded into enterprise applications.

The security engineer’s role in AI is not to block adoption. It is to make secure adoption the path of least resistance. That means being present early, providing opinionated secure tooling, and ensuring that the organizational processes that govern what data goes into AI systems are established before the first knowledge base is built — not after the first insider trading risk surfaces in production.

Actionable Takeaways

  • Establish a data classification gate for AI features: require that any data source intended for AI ingestion be reviewed for sensitivity, regulatory status, and appropriate user scope before the knowledge base or training dataset is built. The insider trading case study shows the cost of skipping this step — classify first, ingest second.
  • Audit every AI security control in your stack for developer bypass potential. On AWS Bedrock, guardrail parameters are optional on each API call; wrap them in application-layer enforcement (Lambda worker code, PR review gates, CloudTrail alerting on guardrail-less calls) so that omitting them is treated as a pipeline error, not a permissible shortcut.
  • Embed security into AI feature development sprints rather than deploying it as a post-ship audit. Provide developers with pre-hardened infrastructure templates (Terraform modules, Bicep templates, Azure Policy definitions) that encode secure defaults for network isolation, content filtering, and IAM scoping — making the secure path the easy path.

Common Pitfalls

  • Treating guardrails and content filters as sufficient compensating controls for unclassified data ingestion. Guardrails block specific outputs; they do not prevent the AI system from ingesting and retaining sensitive data in its knowledge base. The insider trading example illustrates this precisely: a topic guardrail was added after the fact, but the root cause was ingesting material non-public information into the knowledge base in the first place. Classification must precede ingestion.
  • Operating a reactive security posture where platform engineering only learns about new AI features after they ship. At that point, the choices are "turn it off" or "accept the risk." Neither is a sustainable model at the pace AI features are being developed. Security teams that are not integrated into AI feature planning from the start will perpetually be remediating insecure deployments rather than preventing them.

Conclusion

Securing ML and LLMOps deployments is not a single configuration task — it is a continuous discipline that spans cloud platform controls, identity architecture, data governance, and organizational process. The two central findings from this talk are worth internalizing clearly: on AWS Bedrock, the platform’s current resource model means that guardrails are bypassable, resource policies don’t exist, and your primary enforcement levers are IAM principal scoping and CloudTrail-based detective controls. On Azure AI, the defaults are insecure-by-cost-optimization and must be explicitly hardened through managed networking, private endpoints, Entra-based identity, and Azure Policy enforcement.

Beneath both platforms lies the same foundational gap: AI systems act as indiscriminate data funnels, and agentic bots with broad permissions act as machine-speed insiders. The insider trading case study is a production example of this risk, not a theoretical one. The confused deputy problem in MCP pipelines is an identity governance challenge that most organizations are just beginning to model.

Security engineers working in this space should prioritize three things: classify data before it reaches any AI system, enforce guardrail and access controls at the infrastructure layer (not the application layer), and treat machine identities with the same IAM rigor applied to privileged human accounts.

For further reading on adjacent topics covered across The Cyber Archive, explore AI/ML security for model-layer attack and defense patterns, RAG security for knowledge base architecture considerations, and cloud security for IAM and network hardening patterns across AWS and Azure environments.


References & Tools

  1. Snowflake — Cloud data warehouse platform commonly used as an enterprise data source for AI model training and RAG pipelines.
  2. Model Context Protocol (MCP) — Open protocol enabling agentic bots to interact programmatically with APIs and tooling; discussed as the primary surface for confused deputy risk in agentic pipelines.
  3. AWS Bedrock — Managed AWS service providing serverless access to pre-trained and fine-tuned foundational models, RAG knowledge bases, and optional guardrail-based content filtering.
  4. AWS SageMaker — AWS service for pre-training and hosting custom ML models; the higher-cost alternative to Bedrock for organizations requiring full model training control.
  5. AWS Lambda — Serverless compute used in the Bedrock architecture as both a webhook receiver and an asynchronous Bedrock processing worker.
  6. OpenSearch / Aurora (pgvector) — Vector database backends for AWS Bedrock knowledge bases. OpenSearch is natively supported (~$40/day); Aurora with pgvector is ~40x cheaper but not yet a native Bedrock knowledge base data source.
  7. AWS Bedrock Converse API — Meta-API that normalizes requests across all Bedrock foundational models and adds document ingestion support (PDF, Excel, PowerPoint) not available via direct model APIs.
  8. AWS CloudTrail — Primary audit logging mechanism for Bedrock API calls; records which IAM principal called which Bedrock verb — essential for detecting unauthorized usage and monitoring for guardrail bypass.
  9. AWS CloudWatch — Provides usage dashboards and conversation recording for Bedrock; limited by region-wide rather than per-deployment logging scope.
  10. Azure AI Foundry / Azure ML Workspace — Azure's primary platform for deploying and managing AI/ML models; the central resource around which network isolation, RBAC, managed identity, and policy controls are configured.
  11. Azure Key Vault — Recommended for storing certificates, secrets, and keys used by AI workspaces rather than hardcoding credentials in deployment templates.
  12. Azure Front Door / WAF — Edge-layer ingress/egress control for Azure AI applications, protecting API endpoints from external threats before requests reach AI services.
  13. Azure Monitor — Provides diagnostic settings, activity logs, and per-resource insights for Azure AI services; the recommended logging framework for auditing all ML workspace activity.
  14. Azure Policy — Used to enforce compliant AI deployments at scale by blocking insecure model deployments, restricting marketplace model access, and preventing non-private network configurations.
Frequently asked

Questions from the audience

Can AWS Bedrock guardrails be bypassed by developers?
Yes. AWS Bedrock treats guardrails as optional parameters on each API call — the guardrailIdentifier and guardrailVersion fields. A developer who omits these parameters receives unfiltered model output with no platform-level enforcement stopping them. The only compensating controls are CloudTrail alerting on guardrail-less calls, internal SDK wrappers that inject guardrail parameters automatically, and code review gates requiring guardrail parameter presence.
What is the confused deputy problem in agentic AI, and how does OAuth 2 token forwarding address it?
The confused deputy problem in agentic AI occurs when an MCP server or agentic pipeline holds (or forwards) permissions beyond what the triggering user should actually exercise. OAuth 2 token forwarding improves the original MCP design — where all users inherited a static shared permission set — by requiring callers to present their own identity tokens. However, it does not eliminate the risk: high-privilege users (administrators) who interact with MCP servers pass their full administrative token to the bot. Privilege pruning at the MCP layer is required to cap what rights the bot may exercise regardless of the token presented.
How does Azure AI content filtering differ from AWS Bedrock guardrails?
Azure AI content filters are configured at the model deployment level and cannot be bypassed per request — every application calling that deployment inherits the filter. AWS Bedrock guardrails, by contrast, must be explicitly included as parameters on each API call. A developer who omits the guardrail parameter receives unfiltered output. This architectural difference makes Azure's content filtering a stronger enforcement mechanism; AWS has indicated resource-level policies are on their roadmap but not yet available.
What data classification controls should precede AI knowledge base ingestion?
Before any data source is connected to an AI knowledge base or RAG pipeline, it must be reviewed for sensitivity (PII, PHI, financial data), regulatory status (HIPAA, GDPR, PCI-DSS), and materiality (material non-public information under securities law). The user scope of the AI endpoint must match the access restrictions on the underlying data — if only a subset of users should see certain information, the AI system serving a broader audience must not ingest it. System prompt instructions ('don't share this data') are not access controls and do not substitute for classification-gated ingestion.
Watch on YouTube
Securing organizations ML & LLMops deployments : A platform architects journey onboarding LLM & MLops tools and securing multi-cloud data access
Sai Gunaranjan, Kyler Middleton, · 36 min
Watch talk
Keep reading

Related deep dives