The Causal Brain: Living Models and the End of Backward-Looking Analytics
hypothetical.ai explores the creation of realistic real-time hypothetical scenarios
There is a particular kind of organizational suffering that does not announce itself as suffering. It looks like a Monday morning meeting. Someone opens a dashboard. The numbers from last quarter are arrayed with precision — revenue by region, churn by segment, customer acquisition cost trending upward in a way that everyone in the room has already begun to explain without anyone having explained it yet. The meeting lasts ninety minutes. Three action items are logged. Nothing changes. Six weeks later, the same dashboard. The same meeting. The same explanation that is not an explanation.
This is the cost of operating inside a world your analytics system cannot actually see. The dashboard told you what happened. It did not tell you why. It certainly did not tell you what would happen if you did something about it. The data was accurate. The model was useless.
The Living Model is the attempt to build something that does not suffer from this problem — a decision support architecture defined by four properties that together represent a fundamental break from the analytics paradigm that has governed organizational intelligence for three decades. It is causal, meaning it maps structural cause-and-effect rather than correlation. It is counterfactual, meaning it can answer “what would have happened if” for scenarios that never occurred in historical data. It is real-time, meaning it continuously ingests live data streams and updates its outputs accordingly. And it is treatment-oriented, meaning it organizes itself around actionable interventions ranked by expected causal impact rather than passive prediction.
That last property is the one that tends to get lost in the marketing copy. Every enterprise software vendor in 2025 claims to offer “real-time AI insights.” Almost none of them mean what the Living Model means. The difference is the difference between a weather report and a climate simulator — between a system that tells you it is raining and a system that tells you what happens to the river if you open the dam.
What Correlation Cannot Do
Judea Pearl’s “Ladder of Causation” provides the clearest map of the territory. At the first rung: association. This is where nearly all commercial AI currently lives. The system observes that X and Y tend to move together and tells you so. The observation is often useful. It is never sufficient.
At the second rung: intervention. Here the system can answer not “how are X and Y related?” but “what happens to Y if I force X to a specific value?” This requires what Pearl calls the do-operator — a formal representation of deliberate manipulation — and it requires the system to have learned not just statistical patterns but the mechanisms that generate them.
At the third rung: counterfactuals. Here the system can answer “what would Y have been if X had been different, in this specific case, at this specific time, given everything that actually happened?” This is the level at which genuine strategic intelligence becomes possible, because it is the level at which you can evaluate decisions you did not make.
The failure of traditional predictive machine learning is not a failure of sophistication. A well-trained XGBoost model can be remarkably accurate on historical data. The failure is structural. When a company changes its pricing strategy, the historical data that trained the pricing model no longer describes the world the company now inhabits. The intervention changed the system. The model, trained on the pre-intervention world, is now a map of a country that has been reorganized. It does not know this. It keeps giving directions.
Statisticians call this the difference between the observational distribution and the interventional distribution. In plain language: the pattern you learned from watching the system is not the same as the pattern the system produces when you act on it. Prediction assumes the future resembles the past. Strategy is the act of making the future different from the past. These two activities require different tools.
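The gap between the two distributions is easy to demonstrate. A minimal sketch, with an invented confounder ("market heat") driving both a discount policy and sales: watching the system suggests one slope, forcing the variable yields another. All coefficients are illustrative, not estimates from any real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical confounder: market "heat" drives both our discounting
# behaviour (x) and sales (y). Hot markets get fewer discounts and sell more.
heat = rng.normal(size=n)
x = -1.0 * heat + rng.normal(size=n)           # observed policy, not random
y = 0.5 * x + 2.0 * heat + rng.normal(size=n)  # true causal effect of x: +0.5

# Observational slope: the pattern learned from watching the system.
obs_slope = np.cov(x, y)[0, 1] / np.var(x)

# Interventional slope: set x by fiat (the do-operator), severing x from heat.
x_do = rng.normal(size=n)
y_do = 0.5 * x_do + 2.0 * heat + rng.normal(size=n)
do_slope = np.cov(x_do, y_do)[0, 1] / np.var(x_do)

print(round(obs_slope, 2), round(do_slope, 2))
```

The observational slope comes out negative (discounts appear to hurt sales) while the interventional slope recovers the true +0.5: same system, two different distributions.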
The Architecture of the Living Model
The technical implementation of a Living Model begins with a Directed Acyclic Graph — a DAG — which is a visual map of the system’s causal structure. Every node is a variable. Every arrow is a direct causal relationship. The resulting Structural Causal Model converts those arrows into mathematical functions: each variable is expressed as a function of its direct causes plus an exogenous noise term that captures everything unmeasured.
This architecture does something that a regression equation cannot do. It separates the question “what do we observe happening?” from the question “what happens when we act?” The graph encodes the mechanisms of the system, not just its correlations. When you ask the system “what happens if I reduce price by fifteen percent?”, it does not look up the historical relationship between price and sales. It propagates the intervention through the causal structure — accounting for competitive response, customer segment heterogeneity, inventory constraints — and produces a distribution of outcomes across simulated scenarios.
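A toy version of that propagation, assuming an invented three-variable mechanism (price, competitor price, demand) with illustrative coefficients; this is a sketch of the idea, not any platform's actual model:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_revenue(n, price_cut):
    """Push a do(price_cut) intervention through a toy structural causal model.

    Each variable is a function of its direct causes plus noise, as in the
    SCM described above. All coefficients are invented for illustration.
    """
    price = 100.0 * (1.0 - price_cut)
    # Competitor partially matches the cut, eroding the demand gain.
    competitor_price = 100.0 * (1.0 - 0.6 * price_cut) + rng.normal(0, 1, n)
    demand = 2_000 - 12.0 * price + 8.0 * competitor_price + rng.normal(0, 50, n)
    return price * demand

baseline = simulate_revenue(100_000, 0.00)  # do nothing
cut = simulate_revenue(100_000, 0.15)       # do(cut price by 15%)

# The output is a distribution over simulated scenarios, not a point estimate.
print(round(baseline.mean()), round(cut.mean()))
```

In this invented mechanism the fifteen percent cut loses money on average once the competitive response propagates, which is exactly the kind of answer a historical price-sales lookup cannot produce.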
The scale at which this simulation runs is not incidental. The platform literature refers to “thousands of what-if scenarios” as a standard capability, and this is not hyperbole. The computational advance that made this practical is NOTEARS — Non-combinatorial Optimization via Trace Exponential and Augmented Lagrangian Structure learning — which reframes the problem of learning a causal graph from data as a continuous optimization problem rather than a combinatorial search. Before NOTEARS, causal discovery across high-dimensional datasets was computationally prohibitive. The number of possible causal graphs grows super-exponentially with the number of variables. NOTEARS makes the search tractable. It is, in the unglamorous way of genuine scientific progress, the thing that made the rest possible.
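The heart of NOTEARS is a smooth acyclicity function that equals zero exactly when the candidate graph is a DAG, which is what lets gradient-based optimizers replace combinatorial search. A self-contained sketch of that function on a made-up three-node graph:

```python
import numpy as np
from scipy.linalg import expm

def acyclicity(W):
    """NOTEARS acyclicity function h(W) = tr(exp(W * W)) - d.

    h(W) is zero exactly when the weighted adjacency matrix W describes a
    DAG, and strictly positive when W contains a directed cycle. Because h
    is smooth, a continuous optimizer can drive it to zero instead of
    searching the space of graphs combinatorially.
    """
    d = W.shape[0]
    return np.trace(expm(W * W)) - d  # W * W is the elementwise square

# A three-node chain (A -> B -> C) is acyclic...
dag = np.array([[0.0, 1.5, 0.0],
                [0.0, 0.0, 2.0],
                [0.0, 0.0, 0.0]])
# ...while adding the edge C -> A closes a cycle.
cyclic = dag.copy()
cyclic[2, 0] = 1.0

print(acyclicity(dag), acyclicity(cyclic))
```

In the full method this penalty is folded into an augmented Lagrangian alongside a data-fit loss; the sketch shows only the constraint that makes the continuous formulation possible.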
Real-time ingestion is the second architectural requirement, and it is where most enterprise implementations currently fail. A causal model is only as current as the data that updates it. The technical stack required for genuine real-time operation — event capture through systems like Apache Kafka or Redpanda, stream processing through Flink or Spark, real-time query through ClickHouse or Pinot — is mature and available. The organizational barriers to deploying it are not technical. They are the accumulated weight of data architectures built for batch processing, reporting systems designed for the rhythm of the quarterly review, and a decision culture that has never been asked to operate at the speed the data can now support.
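As a sketch of what "continuously updated" means on the model side, here is a minimal online update loop. The `event_stream` generator is an invented stand-in for a real Kafka or Redpanda consumer, and the randomized treatment and effect size are illustrative assumptions:

```python
import random

def event_stream(n=20_000, effect=0.3, seed=7):
    """Stand-in for a Kafka/Redpanda consumer loop: yields (treated, outcome).

    In production this loop would poll a real topic; the downstream logic is
    unchanged. Treatment here is randomized, so a difference in means is an
    unbiased estimate of the effect. All numbers are invented.
    """
    rng = random.Random(seed)
    for _ in range(n):
        treated = rng.random() < 0.5
        outcome = (effect if treated else 0.0) + rng.gauss(0.0, 1.0)
        yield treated, outcome

# Exponentially weighted means keep the estimate current: recent events
# dominate, so drift in the underlying mechanism shows up quickly instead
# of waiting for the next batch job.
alpha = 0.001
mean_treated = mean_control = 0.0
for treated, outcome in event_stream():
    if treated:
        mean_treated += alpha * (outcome - mean_treated)
    else:
        mean_control += alpha * (outcome - mean_control)

live_effect = mean_treated - mean_control
print(round(live_effect, 2))
```

The point of the sketch is the shape of the loop, not the estimator: the model's answer changes as events arrive, at the cadence of the stream rather than the quarterly review.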
Risk Is Probability Times Impact Magnitude
Here the Living Model makes an underappreciated intervention into the practice of analytics.
Traditional risk assessment collapses the problem. It asks: how likely is this bad thing to happen? The result is a probability, and the probability is treated as the risk. This is not wrong exactly. It is incomplete in a way that produces systematically bad decisions.
A ten percent probability of losing one million dollars is not the same as a ten percent probability of losing one billion dollars. Any decision framework that treats these identically has abandoned the purpose of decision-making. Risk is probability times impact magnitude — and collapsing these two dimensions into one loses precisely the information that a decision-maker actually needs.
The Living Model formalizes this through the Expected Value of Intervention. For any proposed strategic action, the EVI is calculated as the product of reliability — the frequency with which the intervention produces positive outcomes — and effect size — the magnitude of the improvement when it does. This is not a novel mathematical insight. It is the formalization of what every experienced strategist already knows and almost no analytics system has been designed to calculate.
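The calculation is simple enough to fit in a few lines. A sketch, with invented candidate interventions and made-up per-scenario impacts:

```python
def evi(outcomes):
    """Expected Value of Intervention = reliability x effect size.

    `outcomes` holds the simulated incremental impact of one intervention
    versus the do-nothing baseline, one entry per scenario.
    """
    wins = [o for o in outcomes if o > 0]
    reliability = len(wins) / len(outcomes)                # how often it helps
    effect_size = sum(wins) / len(wins) if wins else 0.0   # how much, when it does
    return reliability * effect_size

# Hypothetical interventions and scenario outcomes, invented for illustration.
scenarios = {
    "cut price 15%":       [120, -40, 300, 80, -10],  # volatile, big upside
    "expand support team": [30, 25, 40, 35, 20],      # modest, near-certain
    "rebrand":             [-50, 10, -20, 5, -80],    # rarely helps
}

ranked = sorted(scenarios, key=lambda k: evi(scenarios[k]), reverse=True)
print(ranked)
```

Note what the ranking preserves that a pure probability would collapse: the near-certain intervention does not automatically win, because reliability and magnitude are kept as separate factors until the final product.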
What the Living Model adds to this calculation is the counterfactual dimension. The question is not merely “what is the expected value of this intervention?” but “what is the expected value of this intervention compared to what would have happened without it?” Susan Athey’s work on Conditional Average Treatment Effects provides the computational machinery for this distinction. Causal forests — the method she developed with Stefan Wager — allow the estimation of how an intervention’s effect varies across different units, different contexts, different moments in time. This is the difference between knowing that a pricing change increases revenue on average and knowing which customers respond to a pricing change, by how much, and under what conditions.
This heterogeneity is where strategy lives. The average effect is rarely the decision-relevant fact. The decision-relevant fact is the effect on the specific segment, in the specific market, at the specific moment when you are deciding whether to act.
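One way to see the heterogeneity computationally: a T-learner, which fits one outcome model per treatment arm and differences the predictions, run on synthetic data with an invented segment split. This is the simplest stand-in for estimating segment-level effects, not Athey and Wager's causal forest itself:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
n = 20_000

# Invented setup: a price change helps enterprise customers (+2.0) and
# hurts self-serve customers (-1.0). The average effect hides this.
enterprise = rng.integers(0, 2, n).astype(float)
treated = rng.integers(0, 2, n).astype(float)   # a randomized rollout
tau = np.where(enterprise == 1.0, 2.0, -1.0)    # true heterogeneous effect
y = 5.0 + enterprise + tau * treated + rng.normal(0, 1, n)

# T-learner: fit one outcome model per arm, then difference the predictions.
X = enterprise.reshape(-1, 1)
m1 = RandomForestRegressor(n_estimators=50, random_state=0)
m0 = RandomForestRegressor(n_estimators=50, random_state=0)
m1.fit(X[treated == 1.0], y[treated == 1.0])
m0.fit(X[treated == 0.0], y[treated == 0.0])

segments = np.array([[1.0], [0.0]])             # enterprise, self-serve
cate = m1.predict(segments) - m0.predict(segments)
print(cate.round(1))                            # per-segment treatment effects
```

The recovered per-segment effects have opposite signs, while the naive average would report a mild positive effect and recommend the price change for everyone.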
The Plumber’s Objection
Esther Duflo’s “Economist as Plumber” lecture is an underappreciated corrective to the enthusiasm that tends to accompany the announcement of causal AI. Her argument is not against causal inference. It is against the assumption that having the right model is the same as making the right decision.
The plumber’s observation is this: models provide very little guidance on which implementation details will matter. A causal model might correctly identify that fund transfer delays are reducing program participation. What it cannot tell you, without additional investigation, is whether the delay is caused by administrative bottlenecks, verification requirements, banking infrastructure, or the timing of the month relative to harvest cycles. The mechanism matters. The mechanism determines which wrench to use.
This is the limitation that the commercial Living Model literature tends to understate. The platforms are not wrong about what their systems can do. They are often imprecise about what those systems require from the humans who operate them. Automated causal discovery can learn the structure of a system from data. It cannot learn the structure of an implementation failure from data, because the implementation failure is often the reason certain data was never collected.
The practical implication is that Living Models require a different kind of organizational competence than traditional analytics. The skill is not data science in the conventional sense. It is the ability to think structurally about mechanisms — to ask not “what correlates with our churn rate?” but “what are the three or four processes that actually determine whether a customer renews, and which of those processes can we change?” This is domain expertise operating as causal reasoning. It is the thing that turns a sophisticated simulation engine into an organizational asset rather than an expensive dashboard.
The Unconfoundedness Problem
The functional validity of every Living Model rests on an assumption that is almost never perfectly satisfied: unconfoundedness, sometimes called selection-on-observables. This assumption requires that all variables influencing both the decision to intervene and the outcome of the intervention are measured and included in the model.
In a clinical trial, unconfoundedness is achieved by randomization. The coin flip breaks the connection between a patient’s background characteristics and their treatment assignment. No background characteristic can confound the effect estimate because no background characteristic predicts who gets the treatment.
In organizational data, you rarely have a coin flip. You have observational records of what your company decided to do, which were not random. You promoted the sales regions that were already performing. You raised prices in markets where demand was inelastic. You invested in the products customers were already buying. The decisions were intelligent. That intelligence is the problem. Every intelligent decision creates a confounding structure that makes it difficult to estimate the effect of having made a different decision.
The methods developed to address this — Double Machine Learning, Invariant Causal Prediction, instrumental variable estimation — are mathematically sophisticated and organizationally demanding. Double Machine Learning, the method at the core of causaLens’s decisionOS platform, uses orthogonal moment conditions to separate the causal effect of interest from the influence of measured confounders. It requires that you can predict both the treatment and the outcome from observed covariates, that you can do so well, and that the residual variation in treatment — the part that cannot be predicted by background characteristics — is sufficient to identify the causal effect.
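The partialling-out logic at the core of Double Machine Learning can be sketched in a few lines of scikit-learn. The data-generating process and coefficients below are invented, and this is the basic cross-fitted estimator, not any vendor's implementation:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(5)
n = 5_000

# Synthetic world: confounders X drive both treatment T and outcome Y.
# The true causal effect of T on Y is 1.5; an unadjusted fit overstates it.
X = rng.normal(size=(n, 3))
T = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(size=n)
Y = 1.5 * T + 2.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(size=n)

# Partialling out with cross-fitting: predict T and Y from the confounders
# out-of-fold, then regress residual on residual (Frisch-Waugh in ML form).
t_hat = cross_val_predict(GradientBoostingRegressor(random_state=0), X, T, cv=3)
y_hat = cross_val_predict(GradientBoostingRegressor(random_state=0), X, Y, cv=3)
t_res, y_res = T - t_hat, Y - y_hat
theta = (t_res @ y_res) / (t_res @ t_res)   # causal effect estimate

naive = (T @ Y) / (T @ T)                   # no adjustment at all
print(round(theta, 2), round(naive, 2))
```

The residual variation in `t_res` is the "part that cannot be predicted by background characteristics" referred to above; if the confounders predicted treatment perfectly, `t_res` would be all noise-free zeros and the effect would be unidentifiable.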
What none of these methods can do is measure the unmeasured. The latent confounder — the competitor’s internal pricing meeting, the macro-sentiment shift that preceded the customer’s decision, the organizational change that happened six months before the attrition spike — remains the frontier problem of causal inference. The sophistication of the Living Model does not eliminate it. It makes the model honest about where the uncertainty lives.
From Simulation to Intervention
The clinical trial literature provides the clearest precedent for what the Living Model attempts in organizational settings. In drug development, the simulation comes before the trial: you have a model of the disease mechanism, a model of the drug’s action, and a simulation of the treatment effect across a population of virtual patients. The trial then tests whether the simulation was right.
The Living Model inverts part of this sequence. The simulation happens after — or alongside — the observational data. The model learns the mechanism from historical records, builds a causal structure, and then simulates the counterfactual: what would have happened if we had done something different?
The commercial implementation of this logic — platforms like causaLens, Vedrai’s WhAI, and PrescientIQ — represents the attempt to make this process accessible to decision-makers who are not statisticians. The “no-code causal ML” category is real and growing. What it offers is the ability to ask the causal question without writing the causal code. What it requires, still, is the ability to think causally about the system you are modeling. You cannot outsource the question. You can only outsource the calculation.
The treatment-ranked output — the list of potential interventions ordered by expected causal impact — is the Living Model’s most practically important deliverable. It answers the question that every strategy meeting is implicitly trying to answer: given the resources we have, which action produces the most actual change in the outcome we care about? Not the most correlated action. The most causal one.
What Is Actually Being Built
The honest account of where this technology stands in 2025 is this: the theoretical foundations are mature. The commercial implementations are promising and uneven. The organizational conditions required to deploy them well are rare.
Pearl gave us the mathematical language of causality. Athey gave us the computational tools to estimate causal effects at scale. Duflo gave us the reminder that the model is never the intervention — that the distance between a correct causal estimate and an effective organizational change is filled with plumbing, and the plumbing is usually what fails.
The Living Model is the attempt to build a decision support architecture that does what decades of business intelligence have promised and not delivered: to tell you not just what happened and what is likely to happen, but what you should do about it, and why that action and not another, and how confident you should be, and what the expected value of doing nothing is.
That last question — what is the cost of inaction? — is the counterfactual that traditional analytics cannot ask. It requires knowing what would have happened in the absence of an intervention, which requires having a model of the causal mechanism, which requires having built the thing the Living Model is.
The Monday morning meeting that starts with a dashboard is not going to disappear immediately. The dashboards are good at what they do. But the question they cannot answer — not the question of what happened, but the question of what to do, and the question of what would have happened if you had done it differently last quarter, and the question of which of your possible futures is worth building — these questions are now answerable, in principle, by systems that exist, for organizations willing to do the work of building the causal model of themselves.
The data was never the problem. It was always the question.
I’ve been writing about computational doubt at Skepticism.ai. But this argument — the specific argument about the mismatch between what analytics systems are built to do and what strategic intelligence actually requires — felt large enough to deserve its own space. That’s why I started Theorist.ai: a dedicated home for the question of what organizational intelligence owes the next generation of decision-makers, at the precise moment when machines have become genuinely good at answering questions and genuinely poor at knowing which questions are worth asking.
The Living Model is one answer to that question. Hypothetical.ai is where I’m building another — an exploration of realistic real-time hypothetical scenario generation that puts a causal brain directly in the hands of the people running the Monday morning meeting. That work is large enough to warrant its own space too. More there soon.
Tags: Living Model causal AI, Judea Pearl ladder of causation, counterfactual simulation enterprise analytics, structural causal models organizational strategy, real-time causal inference decision support


