Why do blanket AI prohibitions in large enterprises always fail?

Blanket prohibitions eliminate sanctioned AI use but do not eliminate AI use itself. The 5% of the workforce who are power users, and who generate the most operational value from AI tools, will route around bans using consumer platforms, often with weaker security controls and no audit trail. The prohibition creates a worse security posture than a carefully scoped sanctioned deployment would.

What causes the token quota collapse problem for security teams?

Uniform token allocation policies treat a forensics analyst running 40+ pages of log comparison in the same tier as a general office user drafting a two-paragraph email. These workloads differ by an order of magnitude in token consumption. Without tiered allocation designed before launch, power users exhaust their quotas in one or two sessions and revert to unofficial consumer tools.

Is fine-tuning models for security-specific tasks worth the investment?

Based on the Army's experience, fine-tuning delivers diminishing returns in fast-moving model markets. Procurement, accreditation, and deployment cycles are long enough that the base model used for fine-tuning is typically superseded before the specialized model ships. Bespoke workflow tools wrapping frontier models deliver more durable value and inherit model improvements automatically.

What is the silo reflex and why does it matter for AI adoption?

The silo reflex is the organizational tendency to assign a single specialist as the designated expert for any new technology, routing all questions and requirements through that one person. For AI, siloing creates a structural bottleneck where operational knowledge about what analysts actually need stays locked in individual chat histories — invisible to procurement teams writing the next enterprise agreement. The window for treating AI as a niche specialization is closing.

Enterprise AI Adoption in Cybersecurity: 3 Phases

Enterprise AI adoption in cybersecurity rarely fails on technical merit — it fails on procurement cycles, token budgets, and cultural inertia. At the US Army, one of the world’s largest and most complex enterprises, a Lieutenant Colonel spent two years trying to drag a million-person workforce into using AI chatbots and largely came up short. The result is a field report that no vendor slide deck will ever show you.

For security engineers navigating AI rollouts in large organizations, the lessons from military AI deployment cut straight to the practical barriers you’ll face: who gets access, who controls cost, and how you change behavior at scale. This post breaks down all three phases and what it actually takes to move from shadow tool use to enterprise-wide AI fluency.

Key Takeaways

You'll understand the three sequential barriers to enterprise AI adoption — access, cost, and culture — and why solving them in order is the only viable path for large organizations.
You'll be able to identify why generic AI token allocations fail power users like forensics analysts, and how to advocate for tiered usage policies before rolling out AI to your security team.
You'll be able to apply the silo reflex framework to assess whether AI belongs as a specialized function or as a broadly distributed capability in your organization, and to recognize what getting that wrong costs.

Phase 1: The Access Problem and Shadow AI Use

The Blanket Ban: How Enterprise AI Adoption in Cybersecurity Starts

When AI security first emerged as an organizational challenge in 2023, the US Army’s response was representative of large-enterprise reflexes everywhere: block it entirely. Space Force issued the first official memo — no enterprise use of AI, period — and most other service branches followed informally or formally. The intent was to control risk. The effect was to split the workforce into two groups with opposite behaviors.

The 95% majority — personnel who were already indifferent to AI, worried it would replace their jobs, or opposed on general principle — were completely unaffected. They were happy to comply with a ban on something they weren’t using anyway.

The 5% power users — technically curious analysts, incident responders, forensics personnel — kept experimenting. They just moved the work off the network.

Shadow AI Use: The Predictable Organizational Response

For organizations deploying AI prohibitions without sanctioned alternatives, shadow AI use is not a failure of compliance culture — it is a rational response to a capability gap. Lt. Col. Hasbrouck describes the pattern directly: personnel would leave work, go home, open Google AI Studio^[1] or ChatGPT^[2], and query AI systems with questions adjacent to their active work. No classified or sensitive data was transmitted, but operational knowledge — threat analysis questions, capability assessments, analytical framings — was flowing through unsanctioned consumer platforms.

This is a significant governance and compliance risk that many enterprise security leaders underestimate. Shadow AI use creates:

Data residency uncertainty: Even if no sensitive data is explicitly shared, the queries themselves can reveal organizational priorities, investigative focus areas, and capability gaps.
Audit gaps: There is no logging, monitoring, or retention of what questions were asked or what outputs informed decisions.
Inconsistent outputs: Different team members using different consumer tools with different prompting practices produce inconsistent analytical quality with no baseline for evaluation.

The Army’s experience confirms a well-established pattern in enterprise technology adoption: if you prohibit a tool that provides real productivity value without offering a sanctioned alternative, your highest-performing employees will find workarounds. Shadow AI use in 2023 was not a sign of bad actors — it was a signal that the official stance was out of step with operational reality.

The First Sanctioned Tool: Camo GPT

By late 2023, a sanctioned alternative arrived. Camo GPT^[3] — the Army’s internally accredited AI chatbot — completed its Authority to Operate (ATO) cycle and was cleared for use, including with Controlled Unclassified Information (CUI). For power users like Hasbrouck’s team, this was a genuine step forward: they could now query the tool with operationally relevant data without the legal and policy risk of using consumer platforms.

The significance for security teams planning LLM deployment is the ATO cycle itself. Even in a case described as completing “in record time,” the timeline ran from mid-2023 interest through late 2023 accreditation — months of procurement and security review. For organizations outside the military, equivalent processes (vendor security reviews, data processing agreements, compliance assessments) impose similar friction and timeline pressure.

The ATO completion resolved the access problem — but immediately exposed the next constraint layer. Having a sanctioned tool is necessary but not sufficient. The quality, features, and availability of that tool determine whether power users actually adopt it or continue working around it.

Camo GPT: The Army’s First Accredited AI Chatbot and Why It Fell Short

Proof of Concept

Pre-deployment context — shadow AI use as the baseline: Before Camo GPT, Army Cyber personnel with legitimate operational questions — including forensics and malware analysis staff — were running queries at home on consumer tools such as ChatGPT^[2] and Google AI Studio^[1]. No sensitive or classified data was passed, but the workflow was: ask the question on a personal device against a frontier model, mentally sanitize the response, then carry the synthesized answer back to the classified or CUI environment. This shadow pattern was a direct symptom of zero approved enterprise tooling.
The ATO milestone and what it was supposed to solve: Camo GPT was accredited for use with CUI under the Army’s authority-to-operate process. Personnel could now query the tool with operationally sensitive context — threat data, log summaries, incident details — without first sanitizing information down to open-source-safe equivalents. The ATO cycle completing “in record time” in late 2023 was considered a major procurement and accreditation win.
Failure mode 1 — Model quality gap: Camo GPT was built on Llama 2 70B. At the time of deployment, power users inside Army Cyber were personally using GPT-4 Turbo and Gemini Pro on consumer accounts. The capability delta was significant enough that technical staff immediately rejected it as a step backward. For the 95% of the workforce that had no prior AI engagement, model quality was irrelevant. For the 5% of power users driving internal adoption, the quality regression was disqualifying.
Failure mode 2 — Missing UX primitives (document parsing and web browsing): The two features that made ChatGPT and Gemini practically useful for security operations workflows were absent from Camo GPT: document parsing (native PDF/file ingestion) and web browsing (live context retrieval). The consequence for a general user attempting a basic workflow — extracting indicators from a PDF threat report — was: open the PDF manually → copy all text content → paste into a plain text file (the input field accepted only approximately 400 characters, so clipboard paste was unusable) → upload the text file as an attachment. The UX friction at this step eliminated the vast majority of general users before they ever reached a useful query. Web browsing absence meant no live threat intelligence retrieval capability.
Failure mode 3 — Availability and latency instability: The Camo GPT infrastructure was hosted on a GPU server farm without enterprise-grade cloud SLA backing. This produced availability timeouts under normal use and RAG query response latency measured in hours rather than minutes when teams tested document-augmented queries at any meaningful scale.
Net outcome — tool existed on paper, shadow use continued in practice: Despite successful ATO accreditation, Camo GPT’s operational adoption rate among technical staff remained negligible. The tool satisfied the organizational requirement of having an “approved AI chatbot” without satisfying the actual operational requirement of being a useful one. Army Cyber personnel continued using AI Studio^[1] and other consumer tools informally — a direct continuation of the shadow AI use pattern that Camo GPT was intended to eliminate.

What This Phase Teaches Security Engineers

The access problem phase follows a predictable arc in any large organization:

A new capability appears that demonstrably accelerates high-value work.
A blanket prohibition is issued before governance frameworks exist.
Power users route around the ban using consumer tools, creating shadow use patterns.
A sanctioned tool is eventually accredited, resolving the access problem but revealing the next constraint layer.

Actionable Takeaways

Audit shadow AI use before assuming your organization's AI ban is being observed. Survey your highest-performing analysts and incident responders about which consumer AI tools they are using outside of sanctioned channels. The answers will tell you more about your capability gaps than any vendor assessment.
When evaluating enterprise AI tools for security operations, treat the ATO or equivalent compliance review timeline as a first-class planning constraint. For large regulated enterprises, expect six months to two years from vendor selection to sanctioned deployment — plan your shadow use mitigation strategy accordingly.
Document the operational use cases that are driving shadow AI adoption (e.g., log analysis, threat research, report drafting) before pursuing tool procurement. These use cases become your minimum viable feature requirements and prevent you from deploying a sanctioned tool that power users immediately reject because it lacks document parsing or web browsing.
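One way to make that last takeaway concrete is to record each shadow-use case alongside the features it depends on, then check a candidate tool against the list. The sketch below is illustrative only: the use-case names, feature labels, and the `candidate_features` set are assumptions made for the example, not an inventory from the Army's rollout.

```python
# Hypothetical mapping of shadow-use cases to the features each one needs.
REQUIRED_FEATURES = {
    "log analysis": {"file_upload", "large_context"},
    "threat research": {"web_browsing"},
    "report drafting": {"document_parsing"},
}


def missing_features(use_cases, tool_features):
    """Return, per use case, the required features the tool does not offer."""
    gaps = {}
    for name, needed in use_cases.items():
        missing = needed - tool_features
        if missing:
            gaps[name] = sorted(missing)
    return gaps


# Assumed feature set for a candidate tool that lacks parsing and browsing.
candidate_features = {"file_upload", "large_context"}
gaps = missing_features(REQUIRED_FEATURES, candidate_features)
for use_case, missing in gaps.items():
    print(f"{use_case}: missing {', '.join(missing)}")
```

Any use case that appears in `gaps` is a workflow that would likely stay on consumer tools after the sanctioned tool ships.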

Common Pitfalls

Assuming a blanket prohibition on AI tools eliminates AI use within the organization. In practice, it eliminates sanctioned AI use while driving power users to consumer platforms with weaker security controls and no audit trail. The prohibition creates a worse security posture than a carefully scoped sanctioned deployment would.
Treating ATO completion or compliance accreditation as the end of the access problem. Access is a necessary precondition, not a sufficient one. If the accredited tool lacks the features that power users actually need (document parsing, sufficient context window, web browsing), shadow use will continue in parallel with the sanctioned tool — giving the organization the illusion of compliance while the underlying capability gap persists.

Phase 2: Inadequate Tooling and the Token Cost Trap

From Shadow Use to Sanctioned Tools — and Back Again

The first round of officially approved AI in the Army came in late 2023 in the form of Camo GPT^[3], an internally accredited chatbot built on Llama 2 70B. After months of shadow use on personal devices, having a sanctioned tool that could accept Controlled Unclassified Information was a genuine breakthrough. No more “going home to ask a question, then translating the answer back into work context.” The problem was that the tool was immediately insufficient for power users and impractical for general users — for three distinct reasons.

Failure Mode 1: Model Quality Gap

For personnel who had been running GPT-4 Turbo or Gemini Pro at home, stepping down to Llama 2 70B represented a significant capability regression. The model was competent for its time, but the delta was large enough that the same users who had been doing shadow work simply continued doing shadow work — now with a sanctioned alternative sitting unused. This created a dual-tool environment where official adoption numbers looked low not because people didn’t care about AI, but because the enterprise tool couldn’t meet the bar already set by consumer-grade alternatives.

The key lesson for large language model deployment in enterprise security: model version matters enormously to power users. If your sanctioned tool is six months behind the frontier, your technical staff will find workarounds — and those workarounds will be less auditable than official tool use.

Failure Mode 2: Missing Front-End Capabilities

The more broadly damaging issue was not model quality but missing interface features — specifically document parsing and web browsing. These are capabilities the general user population relies on to make AI actually useful in their day-to-day workflows:

Document parsing: A typical use case is “extract information from this PDF.” Without native document ingestion, the workflow degrades into: open the PDF, copy all of the text into a plaintext file, and upload that file as an attachment. Pasting directly was not an option because the input box accepted only approximately 400 characters, so the text file upload was the only viable path. By the time a general user had done all of that, the majority had already abandoned the tool.
Web browsing / internet search: With no ability to pull live context from the internet, the tool was limited to static general knowledge retrieval. For incident response, threat intelligence lookups, or referencing current CVEs, this made the tool effectively useless for real-time operational tasks.

These are not edge cases; they are the two most common ways general users interact with AI tools. When the enterprise platform lacks both, adoption stays in the single-digit percentages outside the power user tier. Security teams evaluating AI tools for security operations must treat document ingestion and live search as baseline requirements, not premium features.
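To put the roughly 400-character input limit described above in perspective, a quick calculation shows how many separate pastes a document would need if direct pasting were the only route. This is a rough sketch: the page count and characters-per-page figure are assumptions chosen for illustration, and only the 400-character limit comes from the account above.

```python
import math

INPUT_LIMIT_CHARS = 400  # approximate input-box limit reported for the tool


def pastes_needed(total_chars, limit=INPUT_LIMIT_CHARS):
    """Number of separate submissions needed to paste text under a size limit."""
    return math.ceil(total_chars / limit)


# Assumption: a 10-page report at roughly 3,000 characters per page.
report_chars = 10 * 3_000
print(pastes_needed(report_chars))  # 75
```

Under those assumed figures, a single report would take dozens of manual pastes, which is why the file-upload workaround was the only realistic option.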

Failure Mode 3: Availability and Performance

The third dimension of failure was reliability. The self-hosted infrastructure behind Camo GPT was run on a best-effort basis — GPU capacity in a server farm somewhere, not backed by the kind of elastic cloud compute that commercial providers use. The consequences:

Availability timeouts under normal usage load
Multi-hour query response times for anything resembling RAG-style workloads (e.g., embedding and querying a document corpus)
No SLA or guaranteed uptime for operational use

For a forensics analyst trying to accelerate an active investigation, a tool that might respond in hours provides no operational value. The enterprise tool was effectively a research preview — usable for low-stakes experimentation, not for production security workflows.

The 2024 Platform Upgrade — and the Token Cost Trap

In 2024, a new platform arrived: commercial API-backed, frontier model access, reliable infrastructure. This solved the three failure modes above. The problem had moved. Now it was a cost and allocation problem.

The platform was sold to the Army at approximately 10x the standard commercial API rate (compared to what you’d pay through AWS or Azure). This alone would be manageable if allocation had been structured intelligently. It was not. For bureaucratic and political reasons, every user received a uniform token allocation — no distinction between a general office worker drafting emails and a forensics analyst running deep log analysis.

The practical outcome was immediate and predictable:

A general user asking the model to “summarize this two-page memo” or “draft this three-paragraph email” would burn through their allocation slowly, if at all.
A forensics analyst loading a 20-page log file, then loading a second 20-page log file from three days prior, and asking the model to diff the two for meaningful behavioral anomalies would exhaust half their monthly quota in a single query.
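The gap between those two workloads can be approximated with simple arithmetic. The sketch below is a back-of-the-envelope estimate, not a measurement: the tokens-per-page densities and output sizes are assumed values, and real counts depend on the tokenizer and on how dense the log lines are.

```python
def estimate_tokens(pages, tokens_per_page, output_tokens):
    """Rough token cost of one request: input pages plus the model's reply."""
    return pages * tokens_per_page + output_tokens


# All figures below are illustrative assumptions, not measured values.
PROSE_TOKENS_PER_PAGE = 700    # assumed density for ordinary prose
LOG_TOKENS_PER_PAGE = 1_500    # assumed density for timestamped log lines

memo_summary = estimate_tokens(pages=2, tokens_per_page=PROSE_TOKENS_PER_PAGE, output_tokens=300)
log_diff = estimate_tokens(pages=40, tokens_per_page=LOG_TOKENS_PER_PAGE, output_tokens=2_000)

print(memo_summary)  # 1700
print(log_diff)      # 62000
print(round(log_diff / memo_summary, 1))  # 36.5
```

Even with generous assumptions for the memo, the two-file log comparison costs well over ten times as much per request, so a quota sized for the first workload cannot absorb the second.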

Token Quota Collapse: How a Forensics Log Diff Query Burned Half a Month’s Allowance

Proof of Concept

Enterprise AI platform deployed with uniform token allocation: The Army acquired a new AI platform backed by commercial APIs (GPT-4-class frontier models). For bureaucratic and political reasons, no distinction was made between general users and power users such as forensics analysts. Every user received the same monthly token budget.
Token pricing applied at a significant markup: The enterprise vendor took the standard API token costs available through AWS or Azure and multiplied them by a factor of approximately 10 to establish their profit margin.
Forensics analyst submits a log-diff query: A member of the forensics and malware analysis team attempted to use the platform for a legitimate, high-value security operations task: comparing a 20-page log file against another 20-page log file from three days prior. The goal was not just a raw diff but an intelligent, meaningful diff — identifying security-relevant changes that a line-by-line comparison tool would miss.
Query construction: The analyst uploaded or pasted the first 20-page log file and then the second 20-page log file into the LLM context, then prompted the model to diff the two and identify not just textual differences but meaningful behavioral or security-relevant changes in the log data.
Token consumption outcome: The combined input context of two 20-page log files, plus the model’s reasoning and output, consumed roughly half the analyst’s entire monthly token allocation in a single query. With the enterprise vendor’s 10x markup applied, this single forensics task was priced at a level that made repeat use economically impossible under the flat quota system.
Quota exhaustion cascade: With half the monthly budget burned on a single query, analysts effectively had no headroom left for additional LLM-assisted forensics work. The team exhausted their tokens by day one or two of the month, leaving the rest of the month without access to the approved platform.
Process for requesting more tokens — effectively impossible: The official remediation path required seven signatures and seven written justifications from high-ranking personnel who did not have sufficient familiarity with LLM token economics to evaluate the requests. In practice, this process was described as effectively impossible to execute within any operationally useful timeframe.
Reversion to shadow tooling: Unable to use the approved enterprise platform after quota exhaustion, forensics personnel reverted to using Google AI Studio^[1] and other unapproved consumer tools — the same shadow behavior that the enterprise deployment was intended to eliminate.
Root cause — no power user tier: The failure was not in the LLM’s capability or the platform’s technical architecture. It was entirely a policy failure: the absence of a tiered token allocation that differentiated between low-volume general users and high-volume technical users with legitimate, repeatable needs for large-context operations.

Figure: Token tier allocation for general users versus security power users, showing their divergent token consumption patterns.

What This Means for Security Teams Deploying AI

The Army’s Phase 2 experience maps directly onto challenges security engineering teams will face when deploying LLM tooling:

Token allocation by role, not by headcount. A uniform token budget treats a SOC analyst running 50-page malware reports the same as an HR coordinator writing meeting summaries. Design tiered allocation before launch, not after.
Evaluate front-end feature parity with consumer tools. If your official enterprise platform can’t ingest PDFs or query the web, your technical staff already knows it’s inferior to what they’re using at home. The adoption battle is lost before it starts.
Infrastructure SLAs matter for operational security use. Best-effort GPU availability is acceptable for experimentation. It is not acceptable for incident response timelines. Require performance guarantees when evaluating AI platforms for security operations workloads.
Procurement markups compound token exhaustion. A 10x markup on API costs isn’t just a budget problem — it shrinks effective capacity by the same multiplier. Negotiate enterprise agreements that reflect actual security team usage patterns, not office productivity benchmarks.
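A role-based allocation of the kind described in the first point could be sketched as a simple lookup. Everything in this example is hypothetical: the tier names, budgets, role labels, and per-query token sizes are placeholders that a real deployment would replace with measured usage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    monthly_tokens: int


# Hypothetical tiers and budgets; real values would come from measured usage.
TIERS = {
    "general": Tier("general", 200_000),
    "power": Tier("power", 5_000_000),
}

# Hypothetical role-to-tier mapping.
ROLE_TIER = {
    "hr_coordinator": "general",
    "soc_analyst": "power",
    "forensics_analyst": "power",
}


def monthly_budget(role):
    """Look up a role's monthly token budget, defaulting to the general tier."""
    return TIERS[ROLE_TIER.get(role, "general")].monthly_tokens


def queries_per_month(role, tokens_per_query):
    """How many queries of a given size fit in the role's monthly budget."""
    return monthly_budget(role) // tokens_per_query


print(queries_per_month("hr_coordinator", 2_000))      # 100
print(queries_per_month("forensics_analyst", 60_000))  # 83
```

The useful check is the second function: if a role's typical query size yields only one or two queries per month, the tier is mis-sized for that role.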

Actionable Takeaways

Before deploying any enterprise AI platform to your security team, map usage personas explicitly: differentiate between general users (email drafting, document summarization) and power users (log analysis, malware triage, large-context forensics tasks). Design separate token allocations for each tier at launch — retrofitting this after deployment requires bureaucratic escalation cycles that may be effectively impossible in large organizations.
Treat document ingestion and live web search as non-negotiable baseline requirements when evaluating AI tools for security operations. If the platform lacks either, measure the real adoption rate among your technical staff against how much shadow use of consumer tools persists — the gap will tell you whether your official platform is actually serving the team.
When negotiating enterprise AI contracts, anchor cost benchmarks to commercial API pricing (e.g., AWS or Azure pass-through rates) rather than accepting vendor-defined enterprise tiers. A 10x markup means your effective token budget is 10% of what an equivalent commercial deployment would provide — which directly limits what power users can accomplish per billing cycle.
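The markup arithmetic in the last takeaway is easy to verify. In the sketch below, the budget and the base price per million tokens are assumed figures for illustration; only the 10x multiplier comes from the account above.

```python
def tokens_affordable(budget_usd, price_per_million_usd, markup=1.0):
    """Tokens a budget buys at a base per-million-token price times a markup."""
    effective_price = price_per_million_usd * markup
    return int(budget_usd / effective_price * 1_000_000)


# Assumed figures for illustration only.
budget = 1_000.0   # monthly spend in USD
base_price = 10.0  # USD per million tokens at the pass-through rate

direct = tokens_affordable(budget, base_price)
marked_up = tokens_affordable(budget, base_price, markup=10.0)

print(direct)              # 100000000
print(marked_up)           # 10000000
print(marked_up / direct)  # 0.1
```

Whatever the actual budget and base price, the ratio stays at one tenth, so the markup translates directly into a tenfold reduction in the work power users can complete per billing cycle.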

Common Pitfalls

Uniform token allocation across all user roles is one of the most predictable failure modes in enterprise AI rollouts for security teams. When forensics analysts, threat hunters, and malware analysts share the same monthly quota as general office users, power users exhaust their allocation in days and revert to unofficial workarounds — creating shadow AI use that is harder to audit than if no enterprise tool had been deployed at all.
Assuming that model availability translates to model adoption. Camo GPT was accredited, available, and technically functional — and still unused by most of the workforce. The gap between "the tool is live" and "the tool is used" is filled by front-end feature parity, performance reliability, and usage policies that reflect real workloads. Launching a platform without validating these dimensions guarantees low adoption statistics and continued shadow tool use.

Phase 3: Enterprise Agreements and the Culture Problem

The Arrival of Proper Enterprise AI Agreements

With the access and cost problems largely resolved, enterprise AI adoption in cybersecurity now faces its hardest phase: organizational culture. For the US Army, Phase 3 arrived with the launch of genai.mil^[4], the DoD-wide AI platform. Unlike the self-hosted API wrappers and inadequate internal tools of earlier phases, genai.mil connects to multiple major commercial AI enterprise platforms and provides access to frontier models through their actual production front ends — the same polished UIs that make those tools useful in the first place.

This is a meaningful distinction. Previous efforts, including Camo GPT and the commercial API-wrapped platform, were essentially custom web layers sitting in front of model backends. The Phase 3 approach hands workers direct access to enterprise-tier tooling that has the document parsing, context handling, and interface quality that power users had been finding on consumer products at home. The technical access gap is effectively closed.

The Silo Reflex: AI as the New “Computer Guy” Problem

Solving the access and cost problems has exposed the real barrier: organizational culture, specifically what Lt. Col. Hasbrouck calls the silo reflex. The pattern is consistent across every new technology a large organization encounters: a new capability arrives, leadership identifies a smart person who understands it, and that person becomes “the box guy” — the designated expert who fields all questions and briefs senior stakeholders on the new thing.

This is a rational starting point. Every major technology transition begins this way. Hasbrouck himself was the “cyber guy” for much of his career, owning the cyber box and answering questions about it. The problem is not the initial silo — it is whether the organization recognizes when a technology has matured beyond the point where siloing it is an acceptable long-term strategy.

The framework he applies is a simple spectrum:

Technologies that can safely stay siloed: Space technology, for example, is something the Army can reasonably leave to specialists. The average soldier’s job does not depend on fluency with orbital mechanics.
Technologies that should have been distributed immediately: GPS and personal computers are obvious cases in retrospect. The organizations that siloed these the longest paid a productivity cost for every month they delayed broad adoption.
Technologies still being resolved: Drones are a live example. The Army is actively debating whether drone capability belongs in a specialized drone corps or should be pushed out to all fighting forces as a foundational skill.

The Silo Reflex: AI as the New “Computer Guy” Problem

Proof of Concept

Recognize the pattern: When a new technology arrives in a large organization, the default institutional response is to assign a single knowledgeable person as the owner — “Joe is the smart guy, he’s the box guy, come to Joe with questions.” The silo reflex is not inherently wrong as a starting point; it is a predictable and sometimes appropriate triage mechanism.
Trace the historical precedents: Hasbrouck walked through several technologies that followed this arc at the US Army:
- Computers: Initially siloed to “the computer guy.” Should have been distributed to the entire enterprise as fast as possible.
- Cyber: Hasbrouck himself was “the cyber guy” for a long time. Cyber is still in flux, and it is not yet settled whether cyber belongs as a specialized function or a broadly distributed competency.
- Drones: Currently facing the same structural debate: create a dedicated drone corps of specialists, or push drone proficiency out to all fighting forces?
Apply the framework to AI: The central question for AI adoption is: does AI look more like VR (a niche technology that stays in a specialist box) or more like PCs (a general-purpose capability that must be distributed to everyone)? The assessment as of the talk: the trajectory is clearly tipping toward PCs. AI is not staying in the box.
Identify the organizational cost of getting it wrong: If AI is treated as a specialist domain — one team’s responsibility, one person’s expertise — the rest of the organization never develops fluency. Requirements stay locked in individual chat histories rather than being extracted, formalized, and fed back into enterprise procurement. The knowledge gap widens between power users and the general workforce.
Extract the cultural implication for security teams: For security organizations specifically, the silo reflex around AI is especially dangerous because security engineers are already specialists. The risk is that “AI for security” becomes a second-order specialization within an already specialized domain — a single team running AI-assisted workflows while the rest of the SOC or forensics function operates as before.

Where Does AI Fall on That Spectrum?

The honest answer, as of the time of this talk, is that AI is tipping toward the PC end of the spectrum — more like a broadly distributed capability than a specialty function. The question for security teams right now is whether they are treating AI the way early enterprises treated the internet: as a thing the IT department handles, rather than a tool every knowledge worker needs fluency with.

The cultural challenge is compounded by a practical problem: the institutional knowledge required to even define AI requirements is currently trapped in people’s personal chat histories on consumer LLM platforms. Analysts have been working through problems with AI tools informally, building intuition and workflow knowledge, but none of that is captured in any system the enterprise procurement process can read. The organizational memory of what AI is actually useful for exists in shadow form — scattered across personal accounts, undocumented, and invisible to the people who write requirements for official tools.

The Knowledge Extraction Problem

This creates a structural bottleneck. The operators at the technical level — forensics analysts, malware researchers, incident responders — have learned through trial and error what kinds of queries AI handles well, what context it needs, and where it breaks down. But the procurement officials and requirements writers who determine which tools get funded and fielded do not have access to that knowledge.

Closing this gap requires more than rolling out a platform. It requires a deliberate process of extracting workflow knowledge from the people who have developed it, translating it into requirements that non-technical stakeholders can evaluate, and packaging it into tools or workflows that the broader enterprise can adopt without needing to replicate every analyst’s two years of informal experimentation.

For security operations, the implication is that cultural adoption is not a soft problem; it is the last technical problem. The organizational infrastructure for capturing, codifying, and distributing AI workflow knowledge does not yet exist in most large enterprises. Building it requires treating institutional knowledge as a first-class artifact, not a side effect of individual use.

Actionable Takeaways

Audit how AI knowledge is currently accumulating in your organization. If your analysts are building AI workflow intuition in personal accounts that the enterprise cannot see, establish a shared knowledge base — even a simple internal wiki — where effective prompts, workflows, and use cases can be documented and shared. This bridges the gap between informal power user knowledge and enterprise requirements.
Apply the silo reflex framework to your own organization's AI posture: ask explicitly whether AI is currently treated as a specialist function or a broadly distributed capability, and whether that designation reflects a deliberate strategy or organizational inertia. For most security operations teams, the answer should be moving toward distributed fluency, not deeper specialization.
When advocating for AI tools for your security team, frame requirements in terms of specific completed workflows and tasks — not model capabilities or fine-tuning specifications. As Hasbrouck notes, the value comes from "completing this task for you," not from providing a fancy model. Requirements documents written in task terms are legible to procurement and to non-technical stakeholders.

Common Pitfalls

Assuming that platform access solves adoption. The Army's experience shows that even after access and cost problems are resolved, a large percentage of the workforce will not use AI tools without active workflow integration, training, and cultural change management. Rolling out a platform and expecting uptake is the most common failure mode in enterprise AI deployment.
Allowing institutional AI knowledge to remain in personal chat histories. When the analysts who develop AI fluency are the only ones who hold that knowledge, the organization is one personnel rotation away from losing it entirely. This is especially acute in military and government contexts where personnel cycles are mandated and predictable — but it applies equally to any enterprise with significant staff turnover.

AI Adoption Lessons for Security Teams in Large Enterprises

Figure: Three phases of enterprise AI adoption (access, tooling/cost, and culture), shown as sequential barriers in the Army's experience.

Segment Power Users from General Users Before You Deploy

One of the clearest lessons from the Army’s enterprise AI adoption experience is that uniform access policies break power users. When forensics analysts running log-diff queries against 20-page files consume half a month’s token quota in a single session, the problem isn’t their usage pattern — the problem is that the allocation policy was designed for the average email-drafting user, not the security analyst doing the actual heavy lifting.

Security operations personnel and SOC automation workflows have fundamentally different throughput needs from general users. A compliance analyst summarizing a two-page policy document and a malware analyst comparing timestamped log files from separate incidents are using the same platform in completely different ways. Flattening them into a single token tier creates friction at exactly the point where AI delivers the most value.

Before deploying any AI platform to a security team, map the usage archetypes first:

General users — document summarization, email drafting, policy lookups. Low token consumption per session.
Power users / security analysts — log analysis, code review, malware triage, multi-document comparison. High token consumption per session.
Automation workflows — RAG pipelines, scripted queries, batch analysis. Variable but potentially very high.

Each tier requires its own allocation policy. The Army’s failure to segment these tiers before rollout meant analysts either burned their quotas by day two or continued using unofficial consumer tools — defeating the purpose of the enterprise deployment entirely.
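The three archetypes above can be written down as an explicit allocation table before rollout. The following is a minimal sketch only: the tier names, the `Tier` class, and every token figure are hypothetical placeholders chosen for illustration, not numbers reported by the Army or any vendor.

```python
from dataclasses import dataclass

# Hypothetical tiers and quotas; replace with figures measured in your own environment.
@dataclass(frozen=True)
class Tier:
    name: str
    monthly_tokens: int

TIERS = {
    "general": Tier("general", 200_000),
    "power": Tier("power", 5_000_000),
    "automation": Tier("automation", 20_000_000),
}

def remaining_after(tier_key: str, used_so_far: int, session_tokens: int) -> int:
    """Tokens left in the monthly quota after one more session (negative means overdrawn)."""
    return TIERS[tier_key].monthly_tokens - used_so_far - session_tokens

def sessions_until_exhausted(tier_key: str, tokens_per_session: int) -> int:
    """Number of full sessions of a given size that fit in one month's quota."""
    if tokens_per_session <= 0:
        raise ValueError("tokens_per_session must be positive")
    return TIERS[tier_key].monthly_tokens // tokens_per_session
```

With these placeholder figures, a 100,000-token log-diff session fits only twice in the general tier's month, which mirrors the quota collapse described above, while the same session fits 50 times in the power tier.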

The Process for Getting More Resources Cannot Be Effectively Impossible

The Army’s token request escalation process — seven signatures, seven written justifications from senior officials who didn’t understand what an LLM was — is an extreme case, but the underlying pattern is common in large enterprises. When the safety valve for power users is a bureaucratic process that no one realistically completes, the de facto policy is that power users get no relief. They route around it.

AI change management for security teams requires that the exception path actually works. If your enterprise AI platform has a mechanism for increasing limits but it requires approval chains that take weeks and explanations that non-technical stakeholders can’t evaluate, that mechanism doesn’t exist in any practical sense. Security teams will find workarounds — and those workarounds will be ungoverned consumer tools.

The practical fix: design the escalation path at the same time you design the base allocation. Define what a power-user justification looks like in terms that procurement and compliance reviewers can evaluate without deep LLM knowledge. Something like “forensics analyst, average session consumes X tokens, requires Y allocation per month” is auditable. “We need more AI tokens” is not.
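One way to make that kind of justification auditable is to derive it from logged session sizes rather than from a free-text request. The sketch below assumes a hypothetical `QuotaJustification` record and a hypothetical headroom percentage; none of these names or values come from the Army's process.

```python
from dataclasses import dataclass

@dataclass
class QuotaJustification:
    role: str
    avg_tokens_per_session: int
    sessions_per_month: int
    requested_monthly_tokens: int

def build_justification(role, session_token_counts, sessions_per_month, headroom_pct=20):
    """Derive a monthly request from observed session sizes (integer math, rounded up)."""
    if not session_token_counts:
        raise ValueError("need at least one observed session")
    # Ceiling of the average session size.
    avg = -(-sum(session_token_counts) // len(session_token_counts))
    # Add headroom as a percentage, rounding up to a whole token.
    base = avg * sessions_per_month * (100 + headroom_pct)
    requested = -(-base // 100)
    return QuotaJustification(role, avg, sessions_per_month, requested)

def render(j: QuotaJustification) -> str:
    """One-line statement a non-technical reviewer can check against the logs."""
    return (
        f"{j.role}: average session consumes {j.avg_tokens_per_session:,} tokens, "
        f"{j.sessions_per_month} sessions per month, "
        f"requires {j.requested_monthly_tokens:,} tokens per month."
    )
```

A reviewer does not need to understand how an LLM works to verify this record: the session counts and sizes can be checked against platform usage logs, and the arithmetic is visible.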

Fine-Tuning Is a Losing Bet in a Fast-Moving Model Market

When asked about fine-tuning or specializing models for individual departments, the Army’s assessment was direct: the general-purpose models advance faster than enterprise fine-tuning cycles can keep pace with. By the time a fine-tuned model clears procurement, accreditation, and deployment, the frontier model it was tuned against has been superseded — and the fine-tuned variant no longer delivers a meaningful advantage.

This is a significant finding for MLOps teams in security organizations considering custom model development. The instinct to build a “security-specialized” model is understandable, but the investment is likely to deliver diminishing returns when a frontier model such as GPT-4 Turbo is superseded by something meaningfully better every few months.

The alternative that the Army found more durable: bespoke workflow tools over specialized models. Instead of fine-tuning a model for a specific task, build a deterministic workflow that wraps a frontier model and handles the task end-to-end. The workflow defines the inputs, outputs, and logic. The model provides the reasoning. When the model improves, the workflow inherits the improvement automatically — no re-tuning required.

For security teams, this maps cleanly to use cases like:

Automated log triage (structured input → LLM analysis → structured output)
Malware report summarization for non-technical stakeholders
Incident response playbook lookup against a RAG knowledge base
Threat intel enrichment pipelines

In each case, the value comes from the workflow design, not from a customized model. The frontier model does the heavy lifting.
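The log triage pattern (structured input, LLM analysis, structured output) can be sketched as a thin deterministic wrapper with the model passed in as a plain callable. This is an illustration under stated assumptions: `triage_logs`, `TriageResult`, the JSON keys, and `stub_model` are all hypothetical, and the stub stands in for whatever frontier-model client an organization actually uses.

```python
import json
from dataclasses import dataclass
from typing import Callable

@dataclass
class TriageResult:
    severity: str
    summary: str

ALLOWED_SEVERITIES = {"low", "medium", "high"}

def build_prompt(log_lines: list[str]) -> str:
    """Fixed prompt template: the workflow, not the analyst, defines the input shape."""
    return (
        "Classify the following log excerpt. Reply with JSON only, using the keys "
        '"severity" (low, medium, or high) and "summary".\n\n' + "\n".join(log_lines)
    )

def triage_logs(log_lines: list[str], model: Callable[[str], str]) -> TriageResult:
    """Call the injected model, then validate its reply before returning it."""
    data = json.loads(model(build_prompt(log_lines)))
    severity = str(data["severity"]).lower()
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unexpected severity: {severity!r}")
    return TriageResult(severity, str(data["summary"]))

def stub_model(prompt: str) -> str:
    """Stand-in for a real model call; returns a canned reply for demonstration."""
    return json.dumps({"severity": "High", "summary": "Repeated failed logins from one host."})
```

Because the model is just a parameter, swapping in a newer frontier model changes one argument and leaves the prompt template, validation, and output schema untouched, which is the sense in which the workflow inherits model improvements.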

The Silo Reflex as an Organizational Risk Indicator

The silo reflex — designating one person as the AI expert and routing all questions through them — is a predictable organizational response to any new technology. It happened with computers. It happened with cybersecurity. It is happening now with AI. The question for security leaders isn’t whether the silo reflex will occur; it’s whether AI belongs in the silo long-term or needs to be distributed across the enterprise.

Technologies like specialized drones or satellite systems can stay in a box — only a small number of people need deep expertise. Technologies like PCs and GPS cannot stay in a box — they have to be generalized because every role depends on them. AI is currently tipping toward the PC side of that spectrum.

For security teams, the lesson from military AI adoption is that the window for treating AI as a niche skill owned by one or two people is closing. The security teams that will operate effectively in two to three years are the ones where analysts at every level can construct a useful prompt, interpret an LLM output critically, and recognize when the model is hallucinating. That is no longer a specialist skill; it is baseline fluency.

The organizational risk of leaving AI in the silo isn’t just capability loss. It’s that requirements never get captured. The real-world knowledge about what analysts need from AI tools currently lives in individual chat histories scattered across unofficial platforms. That institutional knowledge never reaches procurement, never shapes the next enterprise agreement, and never gets baked into the tools that the rest of the organization uses. Breaking the silo is how you close that loop.

Extracting Requirements from the Field Before the Next Procurement Cycle

The Army’s current Phase 3 challenge — translating analyst chat histories into legible procurement requirements — points to a broader problem in large enterprise AI deployments: the people who know what the tools need to do are not the people who write the contracts.

For security operations leaders, this means building a deliberate feedback loop between the analysts doing the work and the process that defines what gets bought next. Practically, this looks like:

Structured use case logging — Ask analysts to document high-value AI sessions: what the task was, what they asked, what the output was, whether it was useful.
Pattern analysis across sessions — Identify which use cases appear repeatedly. These are the ones worth wrapping in a managed workflow tool.
Translating technical needs for procurement reviewers — The output of that analysis should be framed in terms of task completion and measurable outcomes, not model specifications.

The goal is to move requirements out of individual chat histories and into a form that can inform the next enterprise agreement — before the four-year procurement cycle locks in the wrong tool again.
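The logging and pattern-analysis steps above can start as something very small. The sketch below assumes a hypothetical `SessionLog` record whose fields follow the four items analysts are asked to document, and a hypothetical threshold for what counts as a recurring use case.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class SessionLog:
    analyst: str
    task: str            # what the task was
    prompt_summary: str  # what they asked
    output_summary: str  # what the output was
    useful: bool         # whether it was useful

def recurring_use_cases(logs, min_count=2):
    """Return (task, count) pairs for useful sessions seen at least min_count times."""
    counts = Counter(log.task for log in logs if log.useful)
    return [(task, n) for task, n in counts.most_common() if n >= min_count]
```

The tasks that surface here are candidates for a managed workflow tool, and the counts give procurement reviewers a concrete measure of demand to evaluate.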

Actionable Takeaways

Segment your AI platform users into at least two tiers — general users and security power users — before deployment, and set separate token or usage allocations for each. Design the escalation path for power users to request additional capacity using criteria that non-technical approvers can evaluate without LLM expertise.
Prioritize building bespoke workflow tools over investing in fine-tuned or specialized models. Wrap frontier models in deterministic pipelines for your highest-value security use cases (log triage, malware summarization, incident report generation). The workflow persists and improves automatically as underlying models advance.
Build a structured feedback loop between analysts and procurement before the next enterprise AI agreement cycle. Have analysts log high-value AI sessions, identify recurring use cases, and translate those into task-completion requirements that can be evaluated without deep LLM knowledge by the stakeholders who control the contract.

Common Pitfalls

Applying a uniform token allocation policy across all users without distinguishing between general office users and security power users. This reliably fails forensics analysts and other high-throughput users within the first days of a rollout, driving them back to unofficial consumer tools and leaving enterprise tooling underutilized.
Investing procurement cycles and engineering effort in fine-tuning models for security-specific tasks when the pace of frontier model improvement outstrips the time required to tune, accredit, and deploy a custom model. By the time the specialized model is live, the general model it was based on has been superseded.

Conclusion

Lt. Col. Hasbrouck’s three-phase framework is a rare, ground-level account of what enterprise AI adoption actually looks like from inside a million-person organization — not from a vendor whitepaper or a consulting deck. The arc is clear: access problems yield to cost problems, which yield to the cultural challenge that no procurement agreement can solve on its own.

The practical implications for security teams are specific and immediate. Audit shadow use before assuming bans are working. Map usage tiers before deploying token budgets. Design escalation paths that non-technical approvers can actually navigate. Build workflow tools, not fine-tuned models. And break the silo before the window closes and your best analysts’ two years of AI intuition rotates out with them.

The organizations that move through all three phases deliberately — rather than discovering each constraint after the previous one is “solved” — are the ones that arrive at genuine AI fluency before the gap between them and their adversaries widens too far.

For further reading on adjacent topics covered on this site:

AI/ML security — the broader landscape of vulnerabilities, attack surfaces, and defenses in AI-powered systems
Enterprise AI security — governance frameworks, deployment patterns, and risk management for large-scale AI rollouts
SOC automation — how security operations teams are integrating AI into detection, triage, and response workflows

References & Tools

Google AI Studio: Google's direct API interface for Gemini models, used by Army Cyber personnel as an unofficial shadow tool during Phases 1 and 2 when enterprise options were unavailable or insufficient.
ChatGPT (OpenAI): Consumer-grade frontier model interface (GPT-4 Turbo) used by Army Cyber power users in unofficial shadow capacity before official tools were approved; set the quality benchmark that sanctioned tools were measured against.
Camo GPT: The US Army's first accredited AI chatbot, built on Llama 2 70B and cleared for Controlled Unclassified Information in late 2023; illustrated three critical failure modes (model quality gap, missing UX primitives, availability instability) that prevented power user adoption.
genai.mil: DoD-wide enterprise AI platform providing access to frontier models through proper enterprise agreements and production-grade front ends; available to all Department of Defense personnel with NIPRNet access.

Three Phases of AI Adoption | [un]prompted 2026
