The Automation Reckoning
A Confession Dressed in Mathematics: On the $6.7 Billion Bet That Fails One Time in Five
Thoughts on the Bear Brown & Company Venture Capital Due Diligence Report: Agentic AI Sector (Bear Brown & Company, Substack)
There is a formula buried in the middle of this venture capital due diligence report, and it is the most honest thing in it.
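Reconstructed from the numbers the essay itself quotes (the report's exact notation is not reproduced here), the formula is the standard compounding-reliability identity: a workflow of n dependent steps, each succeeding with probability p, succeeds end to end with probability

```latex
P(\text{workflow succeeds}) = p^{\,n}, \qquad \text{e.g. } 0.95^{30} \approx 0.21
```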
What this says, stripped of its notation: if your AI agent succeeds at each step 95% of the time, a workflow requiring thirty steps succeeds just over one time in five. Not most of the time. Not reliably. One time in five. The document presents this as a “technical constraint.” I find myself reading it as something else—a confession dressed in mathematics, an admission that the infrastructure being celebrated across two dozen funding rounds and $6.7 billion in capital is, under examination, a system that fails at a rate no human worker would tolerate in themselves and no employer would tolerate in a human.
That gap—between the funding narrative and the arithmetic—is what this essay is about.
The Thing Being Built
The document under examination is a venture capital due diligence report on the “agentic AI sector,” published by Bear Brown & Company on Substack. It is written in ISE framework style—precise, structured, and admirably willing to say uncomfortable things in footnotes while burying them in sections that follow twenty pages of optimism. The sector it describes has funded, in roughly twelve months, companies valued at $183 billion (Anthropic), $500 billion (OpenAI), and individual applied startups ranging from $4 million seed rounds to $12 billion pre-revenue bets on “talent arbitrage.”
What these companies are building—what the report calls “agents”—are AI systems capable of executing multi-step workflows without human intervention at each decision point. Not copilots. Not assistants. Agents. The distinction matters because it determines whether AI is, as the document puts it, “a productivity multiplier or a labor substitute.” The 2023 cohort built tools that required a human to approve every action. The 2025 cohort is building systems that escalate to humans only when they cannot proceed alone. The implication is directional and the document does not obscure it: “Fully AI Employees now months rather than years away.”
I want to stay with that sentence for a moment. Not to celebrate it, not to condemn it, but to ask what it means to write those words and then, four sections later, acknowledge that 30% of enterprise pilots were “killed by technical debt” when agents tried to integrate with the legacy Oracle, SAP, and Salesforce systems that actually run the enterprises they were meant to replace.
The answer, I think, is that venture capital reports are acts of persuasion before they are acts of analysis. This one is better than most—the section on ESG explicitly calls out labor displacement ethics as “intellectually dishonest” to treat as abstraction—but even good analysis operates within a frame. The frame here is: the threshold has been crossed, the category exists, the returns are available to those who invest correctly. Everything uncomfortable appears within that frame, subservient to it.
The formula, though, refuses to be subordinated. It sits in Section 6 like a splinter.
What the Arithmetic Does to the Argument
Here is the specific mechanism the report identifies as the sector’s central technical breakthrough: tool-calling error rates dropped from 40% to 10% between 2024 and 2025. The document calls this “not incremental improvement”—and it’s right. Going from failing four times in ten to failing once in ten is a genuine leap. It is the difference, as the report correctly observes, between a demo and a deployment.
But the formula shows what this progress actually buys. At 95% per-step reliability, better than the 90% the error-rate figures imply, a ten-step workflow succeeds 60% of the time. A twenty-step workflow, 36%. A thirty-step workflow, 21%. These are the success rates for the multi-day, multi-action workflows that the sector's $10 billion valuations are premised on capturing.
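The compounding is easy to verify. A minimal sketch, using the per-step reliability and step counts from the essay (the function name is mine, purely illustrative):

```python
# Compounding reliability: a workflow of n dependent steps, each
# succeeding with probability p_step, succeeds end to end with
# probability p_step ** n_steps.
def workflow_success(p_step: float, n_steps: int) -> float:
    return p_step ** n_steps

for n in (10, 20, 30):
    print(f"{n:2d} steps at 95% per step -> {workflow_success(0.95, n):.0%}")
# 10 steps -> 60%, 20 steps -> 36%, 30 steps -> 21%
```

Note that the chain degrades geometrically, not linearly: doubling the workflow length squares the failure exposure.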
The pharmacy agent entering prescriptions. The insurance verification system classifying denials. The cybersecurity platform processing 80 million signals per day. Each of these is not a single action. Each is a chain of dependent decisions, and each link in the chain multiplies the failure probability of every other link.
The report acknowledges this. It then proceeds to argue, in its investment thesis, that the threshold has been crossed for "enterprise viability." These two claims cannot both be entirely true. What the report is actually describing, though it does not quite say so, is a sector premised on the bet that reliability will improve fast enough, and that the workflows enterprises first deploy agents on will be short enough and forgiving enough, that the arithmetic does not become visible before the switching costs have locked customers in.
That is not fraud. It is a thesis. But it is a different thesis than “the threshold has been crossed.”
The Labor It Displaces
I want to be direct about the workforce question in a way the report is not, even though the report is more honest than most.
The document frames agentic AI valuation in terms of the "potential to capture the total labor value of the functions they automate." Customer experience labor in the US: approximately 3 million workers at a median fully loaded cost of $55,000 each, or $165 billion in labor spend. The report calculates this cleanly and correctly labels it the "theoretical outer bound."
What it does not do is ask what happens to the 3 million people.
This is not an oversight. It is a genre constraint. A venture capital due diligence memo is not obligated to concern itself with the workers whose displacement generates the returns. The ESG section notes, to its credit, that “investors in this sector are making a bet on workforce displacement at scale” and that treating this as “a social impact abstraction” is “intellectually dishonest.” Having named the thing, though, the document does not pursue it. It documents it and moves on to the financial model.
I am not moving on.
The document profiles VoiceCare AI as a solution to the fact that 70% of US pharmacy locations are understaffed. This is presented as solving a labor mismatch. Look at what is actually being described: pharmacies cannot find or retain enough workers, so the solution is to replace the workers who would have been hired with an AI system that processes prescription intake at scale. The labor mismatch is solved by ensuring that there is no longer a demand for the labor. The workers who might have filled those roles—the ones the shortage was supposedly preventing—are not mentioned after the problem statement.
This is not a conspiracy. It is the logic of capital. But a sector that will process $6.7 billion in annual funding while replacing millions of workers deserves to have that logic named rather than implied.
What the Moat Actually Is
The report’s most analytically precise section concerns competitive advantage, and here the document earns its ISE designation. The central claim: “The model is not the moat.” Anthropic’s $183 billion valuation and OpenAI’s $500 billion are not entry points; they are incumbent costs that define the competitive landscape within which the applied agent layer operates. The viable moat strategies are switching costs, data flywheels, and counter-positioning against incumbents who cannot copy the challenger without breaking their existing business.
Sierra’s counter-positioning against Salesforce is the clearest example. Bret Taylor ran Salesforce. He knows precisely where its architecture breaks and which enterprise customers are most frustrated with it. He has built a company that solves the problem Salesforce cannot solve without rebuilding itself—a rebuilding that would break every existing customer. This is Hamilton Helmer’s framework in its textbook form, and the document is right to identify it as one of the more defensible competitive positions currently visible.
But here is what the switching cost moat actually means for the enterprises being locked in. Once a Sierra agent is integrated into live operations—unified with billing, inventory, and customer conversation data—migration cost is described as “enormous.” The document frames this as investor-favorable: retention is built into the architecture. What it describes, from the enterprise’s perspective, is a vendor relationship in which the costs of exit compound with each passing quarter of integration depth. The moat that protects Sierra’s returns is the same structure that constrains its customers’ choices.
This is not unique to agentic AI. Every SaaS company with strong NRR is benefiting from some version of this dynamic. What is different here is the depth of integration being described. An agent unified with live operational data, processing millions of interactions per month, developing proprietary understanding of a brand’s edge cases—this is not a software tool that can be migrated. It is an operational dependency.
The report notes that Sierra’s containment rate is “not just a performance metric. It is a data accumulation metric.” Every resolved query trains the agent further on that specific customer’s environment. I find myself thinking about what it means to build a system whose improvement is inseparable from its entrenchment. This is the thing that will make these companies extraordinarily valuable. It is also the thing that will make them extraordinarily difficult to leave.
The Question the Formula Asks
Return to the arithmetic. A 30-step workflow at 95% per-step reliability succeeds 21% of the time. The report identifies this as the “central technical constraint” while simultaneously arguing that the threshold for enterprise deployment has been crossed.
These positions can be reconciled—but only if you accept that “enterprise deployment” means something more limited than the funding narrative implies. Not replacing human workers wholesale, but replacing them in short, well-defined, low-stakes workflows. Not the complex insurance denial classifications or the pharmacy prescription entries that require 20+ decision steps, but the repetitive, low-variance workflows where five steps suffice and the cost of the occasional failure is manageable.
That is a viable sector. It is not the sector being funded at $10 billion valuations.
What the $10 billion valuations price in is the assumption that reliability will improve—that 95% per-step will become 99%, that 99% will become 99.9%, that the arithmetic ceiling will be raised by engineering before it becomes visible as a constraint. That is a bet on a technical trajectory, and the bet may be correct. The per-step error rate did drop from 40% to 10% in a single year. There is no law of physics preventing further improvement.
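The bet can be made concrete by inverting the formula: the longest chain of dependent steps that still clears a given end-to-end success target is floor(ln(target) / ln(p_step)). A sketch under an illustrative 50% end-to-end target (the target and function name are my assumptions, not the report's):

```python
import math

# Longest chain of dependent steps that still meets an end-to-end
# success target, given per-step reliability p_step:
# the largest integer n with p_step ** n >= target.
def max_steps(p_step: float, target: float) -> int:
    return math.floor(math.log(target) / math.log(p_step))

for p in (0.95, 0.99, 0.999):
    print(f"per-step {p:.1%}: up to {max_steps(p, 0.5)} steps at >=50% end-to-end")
# per-step 95% -> 13 steps, 99% -> 68 steps, 99.9% -> 692 steps
```

Going from 95% to 99% per step quintuples the feasible workflow length; going to 99.9% multiplies it fiftyfold. That nonlinearity is exactly what the valuations are pricing in.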
But the document, to its credit, identifies the real threat: "What happens when the first high-profile agentic failure occurs in a regulated vertical?" One pharmacy agent that enters the wrong prescription. One insurance agent that misclassifies a denial for someone who needed coverage. The report calls this a "liability" problem distinct from a "reliability" problem. One is technical. The other is legal. One can be solved by engineering. The other will be solved by lawyers, regulators, and the enterprises that decide, after the headline, that the reputational costs of being associated with the failure outweigh the switching costs of leaving.
That event is not in the financial model. No financial model can hold it. But it is the variable that determines whether the 15% probability of category leadership the report assigns to companies like Cognition AI should be 15% or 5%.
The formula asks a simple question: how many steps before the system fails? The sector has not yet answered it honestly. The investment capital has gotten there first. That is, historically, how these things tend to go—and why the mathematics in the middle of a VC report deserves more attention than the executive summary that precedes it.
Here is what we must ask ourselves: what does it mean to fund, at scale and with genuine sophistication, a sector whose central thesis is that humans are the error in the workflow? The answer may be that it is simply the next stage of industrial automation, as inevitable as the loom or the assembly line. But the loom and the assembly line also changed everything. The least we owe to that transformation is to call it by its name.
Tags: agentic AI investment thesis critique, venture capital reliability arithmetic, labor displacement automation ethics, switching cost moat enterprise AI, ISE framework due diligence analysis