unikode

Research note / Enterprise AI

Everyone has AI. Few can turn it into work.

The AI adoption gap is now an execution gap: companies have tools, but not the systems to turn model output into owned, reviewed, measurable work. The missing layer is structured intelligence.

88% of McKinsey respondents reported regular AI use in at least one function (McKinsey 2025)

60% of firms in BCG research reported minimal revenue and cost gains from AI (BCG 2025)

21% of Deloitte respondents reported mature governance for agentic AI (Deloitte 2026)

>40% of agentic AI projects will be canceled by the end of 2027, according to a Gartner forecast (Gartner 2025)

Everyone has AI now. Access is no longer the advantage. The advantage is whether an organization can turn model output into accountable work: work with context, ownership, source records, review paths, measurable outcomes, and a way to learn from what happened.

The market has moved past the simple adoption question. McKinsey reports broad AI use, but much smaller shares report scaled programs, EBIT impact, or high-performer status. BCG finds that many firms still see minimal material value despite substantial investment. The gap is no longer only about who has tools. It is about who can absorb the output into the way the business actually runs.

This is where structured intelligence becomes concrete. It is the operating layer that gives AI work usable context, evidence, controls, ownership, review, and institutional learning. Without that layer, useful output can die before it becomes useful work.

01

Adoption has become too broad a word. The denominator changes the story.

02

Many AI programs stall after the demo because the business process around the output is missing.

03

Productivity is a workflow property, not a model property.

04

Structured intelligence is the missing layer: context, evidence, controls, ownership, review, and learning.

AI adoption numbers hide the real problem

AI adoption is often discussed as a single curve. That framing hides the real problem. Different sources measure different units: organizations, firms, workers, employment-weighted firms, product usage, and tasks.

McKinsey reports that 88% of survey respondents say their organizations use AI regularly in at least one business function. The Census Bureau's Business Trends and Outlook Survey (BTOS) reports a much lower figure: between December 2025 and May 2026, 17% to 20% of businesses reported using AI in the past two weeks. The Federal Reserve shows why both can be true: adoption looks different when the unit is the firm, the worker, or the worker inside an AI-adopting firm.

Figure 1. Adoption changes when the unit of analysis changes.

Use these rows as a denominator map, not as a funnel or a benchmark average.

These measures should not be averaged. They explain why adoption can look mature in one dataset and early in another.
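A small worked example makes the denominator point concrete. The sketch below computes four adoption rates from one invented set of five firms. Every firm, headcount, and usage count is hypothetical and does not come from McKinsey, the Census Bureau, or the Federal Reserve; the function names are illustrative as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Firm:
    name: str
    employees: int
    uses_ai: bool          # the firm reports regular AI use
    workers_using_ai: int  # employees who personally use AI


# Hypothetical data, invented for illustration only.
FIRMS = [
    Firm("large-a", 10_000, True, 3_000),
    Firm("mid-b", 500, True, 100),
    Firm("small-c", 20, False, 0),
    Firm("small-d", 10, False, 0),
    Firm("small-e", 10, False, 1),  # informal use inside a non-adopting firm
]


def firm_share(firms):
    """Share of firms that report AI use (unit: the firm)."""
    return sum(f.uses_ai for f in firms) / len(firms)


def employment_weighted_share(firms):
    """Share of all workers who are employed at AI-adopting firms."""
    total = sum(f.employees for f in firms)
    return sum(f.employees for f in firms if f.uses_ai) / total


def worker_share(firms):
    """Share of all workers who personally use AI (unit: the worker)."""
    total = sum(f.employees for f in firms)
    return sum(f.workers_using_ai for f in firms) / total


def worker_share_in_adopting_firms(firms):
    """Share of workers using AI, counting only AI-adopting firms."""
    adopters = [f for f in firms if f.uses_ai]
    return sum(f.workers_using_ai for f in adopters) / sum(f.employees for f in adopters)


if __name__ == "__main__":
    for measure in (firm_share, employment_weighted_share,
                    worker_share, worker_share_in_adopting_firms):
        print(f"{measure.__name__}: {measure(FIRMS):.1%}")
```

On this invented data the firm-level share is 40.0%, the employment-weighted share is about 99.6%, and both worker-level shares fall between 29% and 30%. None of these numbers is wrong. They answer different questions about the same population.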

Using AI is not the same as operationalizing it

The most useful McKinsey finding is not just that 88% of respondents report regular AI use. It is the gap between that figure and the smaller set of organizations reporting scaled programs, EBIT impact, and high-performer status. (McKinsey)

Figure 2. Adoption is broad. Operating absorption is narrower.

The gap opens after access, when work has to be redesigned around review and accountability.

Values are survey-reported shares from one McKinsey survey. They are related maturity signals, not a single conversion funnel.

The bottleneck is not the first draft, answer, summary, or suggestion. The bottleneck is the system around the output: who requested it, what source material it used, what standard applies, who owns the result, who reviews it, and how the organization learns from it.
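One way to picture "the system around the output" is as a record that travels with each piece of AI-assisted work. The sketch below is a hypothetical structure, not a unikode schema or any vendor's API: the class name `WorkRecord`, its fields, and the `missing_context` check are all assumptions chosen to mirror the questions in the paragraph above.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Fields that must be filled before an output counts as accountable work.
# This list is an assumption drawn from the questions in the text.
REQUIRED_FIELDS = ("requested_by", "sources", "standard", "owner", "reviewer")


@dataclass
class WorkRecord:
    output: str                                        # the draft, answer, or summary
    requested_by: str = ""                             # who asked for it
    sources: list[str] = field(default_factory=list)   # source material used
    standard: str = ""                                 # which standard applies
    owner: str = ""                                    # who owns the result
    reviewer: str = ""                                 # who reviews it
    lessons: list[str] = field(default_factory=list)   # what was learned afterward


def missing_context(record: WorkRecord) -> list[str]:
    """Return the names of required fields that are still empty."""
    return [name for name in REQUIRED_FIELDS if not getattr(record, name)]


if __name__ == "__main__":
    # Hypothetical example: an output with a requester but nothing else attached.
    draft = WorkRecord(output="Summary of Q3 churn drivers", requested_by="ops-lead")
    print(missing_context(draft))  # fields still to fill before review
```

The design choice here is that the output is the only field a model can fill by itself. Everything else has to come from the organization, which is where the bottleneck described above sits.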

Where AI value dies

Value rarely dies at the moment of generation. It usually dies afterward. A pilot can produce a strong answer and still fail if the company cannot place that answer inside the way the business actually runs.

The sources point to five recurring failure modes: value is not measured, governance trails deployment, data context is fragmented, agents are given more ambition than controls, and employees use AI faster than the organization redesigns the work around them. (BCG; IBM)

Figure 3. The hard part starts after the output.

Enterprise AI value fails when useful work cannot move through context, control, and measurement.

These sources measure different populations and methods. Read the figure as a failure-mode map, not a single benchmark.

This is the downside of the current cycle. Companies can spend on licenses, demos, and pilots while still leaving managers with unowned outputs, unclear review standards, uncertain data quality, and no durable way to learn from completed work. The organization may look active without becoming more capable.

IBM's 2025 CEO study points to the same issue from the executive side: surveyed CEOs reported that only 25% of AI initiatives had delivered expected ROI over the previous few years and only 16% had scaled enterprise-wide, and 50% said rapid investment had left their organizations with disconnected technology. Deloitte's 2025 ROI survey found rising investment, but only around one in five surveyed organizations qualified as AI ROI Leaders. These are different measures. They point to the same operating problem: spend is easier to approve than value is to measure and absorb. (IBM; Deloitte)

AI changes tasks before it changes jobs

Anthropic's Economic Index is valuable because it looks below the org chart. The first report found that roughly 36% of occupations had AI use in at least a quarter of associated tasks, with observed usage leaning toward augmentation at 57% versus automation at 43%. (Anthropic)

The March 2026 update showed usage becoming less concentrated inside Anthropic's Claude data. The top 10 tasks fell from 24% of Claude.ai traffic in November 2025 to 19% in February 2026, and about 49% of jobs had at least a quarter of tasks observed in Claude usage. (Anthropic Economic Index)

Figure 4. Usage is spreading across tasks while feasibility and adoption remain uneven.

The most granular evidence in this source set is task-level and Claude-specific.

Anthropic's data is platform-specific and classifier-mediated. It is useful for direction and task composition, not for total labor-market measurement.

This is how enterprise AI should be evaluated. The better question is not whether a job is "automated." It is which tasks have enough context, repetition, evidence, and review discipline to become repeatable AI-supported work.

AI works when the workflow is ready

Productivity is a workflow property, not a model property. The evidence is not a single number. It is a set of boundary conditions. A QJE study of 5,172 customer-support agents found that access to a generative AI assistant increased issues resolved per hour by 15% on average, with larger gains for less experienced and lower-skilled workers. (QJE)

METR found the opposite in a different setting. In a randomized controlled trial with 16 experienced open-source developers working on familiar repositories, access to early 2025 AI tools made completion time 19% longer. Developers had expected a 24% speedup. METR now labels the result as historical evidence because model capability has moved since the study window. (METR)

Figure 5. AI productivity depends on the work system around the model.

The same model class can help or slow work depending on context, task shape, and review load.

The studies use different tasks and methods. The relevant pattern is variance, not a universal productivity estimate.

The variance is the lesson. AI helps when the task is bounded, the source material is available, the output can be reviewed, and the feedback loop is short. It can slow work down when the user has to reconstruct context, verify too much, or absorb errors that the workflow did not anticipate. The model matters, but the surrounding work system determines whether capability turns into productivity.
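The two headline results are reported in different units: the QJE study in issues resolved per hour, the METR study in completion time. The arithmetic below puts them on a common scale under one simplifying assumption, that throughput is the reciprocal of time per task. Neither study makes that assumption, and an average across workers does not invert exactly, so the outputs are rough orientation only. The function name is illustrative.

```python
def invert_rate_change(change: float) -> float:
    """Convert a relative change in tasks per hour into the implied relative
    change in time per task, or the reverse. Assumes rate = 1 / time."""
    return 1.0 / (1.0 + change) - 1.0


if __name__ == "__main__":
    # QJE: +15% issues resolved per hour -> implied change in time per issue.
    print(f"QJE, time per issue: {invert_rate_change(0.15):+.1%}")
    # METR: +19% completion time -> implied change in tasks per hour.
    print(f"METR, tasks per hour: {invert_rate_change(0.19):+.1%}")
```

Under that assumption, the support result corresponds to roughly 13% less time per issue, and the developer result to roughly 16% fewer tasks per hour. The two effects are of similar size and opposite sign, which is the variance the studies point to.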

Agents expose the operating-model gap

Agents matter because they expose what chat can hide. A person can use a model informally and absorb the risk alone. A system that acts on business work needs authorization, source discipline, policy boundaries, exception handling, and a reviewer who can trust the path taken.

The data does not support a claim that agents are already running the enterprise. It supports a more precise claim: agent activity is real, production deployment is early, and weak operating controls are a likely failure mode.

Figure 6. Agent ambition is ahead of production maturity.

Agents convert informal assistance into an operating-model question.

Gartner's figure is a forecast. McKinsey and Stanford use different definitions. Read the figure as a maturity map, not a direct comparison.

This is why agents are not mainly a feature question. They are an operating-model question. If software can act, the organization must define the allowed action, the evidence threshold, the reviewer, the exception path, and the feedback loop.
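The five definitions in the paragraph above can be written down as a small policy check. This is a minimal sketch under stated assumptions, not a reference design: `ActionPolicy`, `decide`, `run`, the use of a source count as the evidence threshold, and the example action names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    EXECUTE = "execute"
    REVIEW = "route to reviewer"
    EXCEPTION = "route to exception path"


@dataclass(frozen=True)
class ActionPolicy:
    allowed_actions: frozenset[str]  # the allowed actions
    min_sources: int                 # the evidence threshold (assumed: a source count)
    reviewer: str                    # who checks work that falls below the threshold
    exception_owner: str             # who handles requests outside the policy


def decide(policy: ActionPolicy, action: str, source_count: int) -> tuple[Decision, str]:
    """Return the decision and the party responsible for the next step."""
    if action not in policy.allowed_actions:
        return Decision.EXCEPTION, policy.exception_owner
    if source_count < policy.min_sources:
        return Decision.REVIEW, policy.reviewer
    return Decision.EXECUTE, "agent"


def run(policy: ActionPolicy, requests, feedback_log: list) -> list:
    """Decide each (action, source_count) request and log it as the feedback loop."""
    for action, source_count in requests:
        decision, responsible = decide(policy, action, source_count)
        feedback_log.append((action, decision.value, responsible))
    return feedback_log


if __name__ == "__main__":
    policy = ActionPolicy(
        allowed_actions=frozenset({"draft_reply", "update_ticket"}),
        min_sources=2,
        reviewer="support-lead",
        exception_owner="risk-team",
    )
    for entry in run(policy, [("draft_reply", 3), ("draft_reply", 1), ("issue_refund", 5)], []):
        print(entry)
```

The point of the sketch is that every branch names a responsible party. An agent without this kind of table still makes the same decisions, only implicitly.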

The bottleneck moves from output to review

Faster work is the first-order effect. More work entering the system is the second-order effect. The third-order effect is that review, approval, governance, and quality control become the bottleneck. The fourth-order effect is a new operating model.

First order: Outputs get faster.

Drafts, summaries, analyses, and code suggestions can appear quickly.

Second order: Demand increases.

More people ask for more work because the apparent cost of a first draft falls.

Third order: Review becomes scarce.

The organization now waits on context, judgment, approval, and trust.

Fourth order: The operating layer changes.

The business needs systems that structure work before AI acts and preserve learning afterward.
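The second- and third-order effects can be sketched as simple queue arithmetic. The model below assumes a fixed daily review capacity and a constant daily flow of new drafts. Both figures are hypothetical, the function name is illustrative, and real review queues are burstier than this.

```python
def review_backlog(arrivals_per_day: float, reviews_per_day: float, days: int) -> list[float]:
    """Return the end-of-day count of items waiting for review.

    Each day, new items arrive and reviewers clear up to their capacity.
    The backlog cannot go below zero.
    """
    backlog = 0.0
    history = []
    for _ in range(days):
        backlog = max(0.0, backlog + arrivals_per_day - reviews_per_day)
        history.append(backlog)
    return history


if __name__ == "__main__":
    # Hypothetical team: reviewers can clear 10 items per day.
    print(review_backlog(arrivals_per_day=8, reviews_per_day=10, days=5))   # before faster drafting
    print(review_backlog(arrivals_per_day=14, reviews_per_day=10, days=5))  # after demand rises
```

With 8 arrivals against a capacity of 10, the backlog stays at zero. With 14 arrivals against the same capacity, it grows by 4 items every day and reaches 20 by day five. Faster drafting did not change review capacity, so the wait moved downstream.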

The missing layer is structured intelligence

The answer is not to push every process toward autonomy. The answer is to make business knowledge usable before software acts. Documents, policies, research, customer context, decisions, approvals, and expert judgment need to become structured intelligence: managed, reviewable, reusable, and ready for accountable work.

This is unikode's interpretation of the evidence: the next useful software layer will be judged less by whether it adds another prompt box and more by whether it preserves context, binds work to evidence, routes review, keeps human direction intact, and turns completed work into institutional learning.

Context before output

Task, policy, source material, customer record, and review standard should be present up front.

Work object before workflow

The business should define the thing being produced or changed before asking software to act.

Review matched to risk

NIST points organizations toward risk management across design, development, use, and evaluation.

Learning after completion

Microsoft finds organizational factors are more strongly associated with reported AI impact than individual effort alone.

The strategic question is no longer, "Do we have access to AI?" It is, "Can our organization convert intelligence into governed execution without losing context, control, or learning?" The winners will not be the companies with the most AI usage. They will be the companies that can turn intelligence into accountable work.

What this analysis does not show

The sources use different samples, time windows, definitions, and methods. The charts are evidence signals, not a combined benchmark. Vendor and consulting sources are paired with public-sector and academic evidence because no single source measures the whole market.

This analysis also does not argue that every workflow should become agentic. Some work is better handled by ordinary software, a focused assistant, or direct professional judgment. The narrower claim is that consequential AI work needs structured context, ownership, measurement, and review.

Evidence behind the argument

These sources were selected because they separate adoption, task diffusion, scaling, productivity, and risk. That separation is necessary for a defensible view of enterprise AI maturity.

McKinsey: The State of AI, 2025

Used for adoption, enterprise scaling, EBIT impact, high-performer share, workflow redesign, and early agentic AI adoption indicators.

View report

BCG: The Widening AI Value Gap

Used for the gap between AI investment and material value, including BCG's finding that many firms report minimal revenue and cost gains.

View report

IBM: CEOs double down on AI while navigating enterprise hurdles

Used for CEO-reported ROI, enterprise-wide scaling, and disconnected technology risks from fast AI investment.

View study

Deloitte: AI ROI and elusive returns

Used for investment momentum, delayed payback, ROI leader concentration, and significant measurable ROI rates for generative and agentic AI.

View analysis

Stanford HAI: 2026 AI Index Report

Used as the annual reference point for corporate AI adoption, labor-market signals, and productivity research summaries.

View report

Stanford HAI: 2026 AI Index, Economy chapter

Used for generative AI business use, agent deployment maturity, productivity findings, and labor-market caveats.

View chapter

U.S. Census Bureau: AI Use at U.S. Businesses

Used for firm-level U.S. business adoption from the Business Trends and Outlook Survey, including the 17% to 20% of businesses that reported using AI in the past two weeks between December 2025 and May 2026.

View analysis

Federal Reserve: Monitoring AI Adoption in the U.S. Economy

Used for the distinction between firm adoption, worker adoption, and employment-weighted exposure through 2025.

View note

Anthropic: Introducing the Economic Index

Used for task-level evidence from Claude conversations, including occupation coverage and the augmentation versus automation split.

View report

Anthropic Economic Index: Learning curves

Used for the March 2026 update on task diversification and the share of jobs with at least a quarter of tasks observed in Claude usage.

View report

Anthropic: Labor market impacts of AI

Used for the distinction between theoretical AI capability and observed task coverage across occupations.

View report

Microsoft: 2026 Work Trend Index

Used for the argument that individual AI use is ahead of the organizational systems needed to support and compound it.

View report

Deloitte: Agentic AI is scaling faster than guardrails

Used for the gap between agentic AI ambition and mature governance practices across surveyed enterprises.

View analysis

Gartner: Agentic AI project cancellation forecast

Used as a caution that unclear business value, cost, and risk controls can stall agentic AI projects before production.

View release

NIST: AI Risk Management Framework

Used for the risk-management baseline around trustworthiness considerations in the design, development, use, and evaluation of AI systems.

View framework

IBM: The True Cost of Poor Data Quality

Used for the data-quality and governance barrier to scaling AI, including concerns about data accuracy and bias.

View analysis

IBM: The Biggest AI Adoption Challenges for 2026

Used for the broader list of enterprise deployment constraints: fragmented data, governance, security, skills, cost justification, and workflow integration.

View analysis

QJE: Generative AI at Work

Used for field evidence that AI assistance increased customer-support productivity by about 15% on average, with heterogeneous worker effects.

View paper

METR: Early-2025 AI and open-source developer productivity

Used as historical early-2025 randomized evidence that AI can slow experienced developers in complex, familiar repositories when review and context burdens are high.

View study