A company hires an AI vendor. The vendor builds a model. The model goes into production. Six months later, leadership asks a reasonable question: is this thing actually working?
Nobody can answer. Not because the model is broken, but because nobody defined what "working" means before the project started. There are no baselines to compare against. No targets to measure toward. No KPIs that connect the model's output to a business outcome anyone cares about.
This happens more often than it should. And the problem isn't AI — it's measurement maturity.
What Measurement Maturity Actually Means
Measurement maturity is the organizational capability to define what matters, quantify where you stand, and track whether things are improving. It has four components that build on each other.
- KPIs. Which metrics actually drive your business? Not vanity numbers or things that are easy to count — the metrics that connect to revenue, efficiency, customer outcomes, or operational health. If you run 90 behavioral health locations, that might be billable hours per clinician, claim denial rates, or time from service to reimbursement.
- Baselines. Where do those metrics stand right now? Without a baseline, any improvement claim is a guess. This is often the hardest step, because it requires clean, reliable data — which many organizations don't have yet. That's fine. Getting to a trustworthy baseline is valuable work in itself.
- Targets. Where should those metrics be in three months, six months, a year? Targets give AI projects a finish line. They make it possible to say "this model reduced claim denials by 12% against a target of 15%" — which is far more useful than "we deployed a model."
- Feedback loops. How do you check progress and adjust? This is what separates a one-time measurement exercise from actual maturity. Feedback loops mean someone is looking at the numbers regularly, comparing actuals to targets, and making decisions based on what they see. It's the difference between a dashboard that exists and a dashboard that drives action.
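The four components above can be sketched as a tiny data structure. This is a minimal illustration, not a real implementation — the class, field names, and the 20% baseline denial rate are assumptions invented to make the "12% reduction against a 15% target" example concrete.

```python
from dataclasses import dataclass

@dataclass
class Kpi:
    """One KPI with the three numbers measurement maturity requires."""
    name: str
    baseline: float  # where the metric stood before the project
    target: float    # where leadership wants it to be
    actual: float    # where it stands now

    def gap_closed(self) -> float:
        """Fraction of the baseline-to-target gap actually closed --
        the comparison a feedback loop runs on every review."""
        return (self.baseline - self.actual) / (self.baseline - self.target)

# Hypothetical numbers matching the example in the text: a 20% baseline
# denial rate, a target of a 15% relative reduction (17.0%), and an
# actual 12% reduction (17.6%).
denials = Kpi("first-pass denial rate", baseline=0.20, target=0.17, actual=0.176)
print(f"{denials.gap_closed():.0%} of the way to target")  # 80% of the way to target
```

The point of the structure is that "actual" is meaningless on its own: it only becomes a progress claim once baseline and target exist to compare it against.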
These four elements are simple to describe and often difficult to implement. Most organizations have some version of KPIs — a spreadsheet, a quarterly report, a set of numbers someone tracks. But having metrics and having measurement maturity are different things.
Why AI Needs This More Than BI Ever Did
Business intelligence can work with loose metrics. A dashboard that shows revenue by region is useful even if nobody has defined what "good" looks like. People see the numbers, they notice trends, they draw their own conclusions. BI is fundamentally a visibility tool — it shows you what's happening and lets humans decide what to do about it.
AI is different. AI takes action — or recommends action — based on patterns in your data. To train a model, you need to define what a good outcome looks like. To evaluate a model, you need to compare its predictions against reality. To improve a model, you need a feedback loop that tells it where it was right and where it was wrong.
Every one of those steps requires measurement maturity. Without KPIs, you don't know what the model should optimize for. Without baselines, you can't tell whether the model improved anything. Without targets, there's no way to know whether the investment was worth it. Without feedback loops, the model can't learn and the organization can't course-correct.
This is why measurement maturity sits at the base of the AI Foundations framework. Data governance gives you trusted data. Pipelines move it reliably. Measurement maturity tells you what to do with it — and whether what you did worked.
What This Looks Like in Practice
Consider a 90-location behavioral health organization. Before any AI can help with scheduling optimization, claims processing, or clinical workflow automation, leadership needs to answer straightforward questions: How many billable hours is each clinician delivering per week? What's the average time from service delivery to claim submission? What percentage of claims are denied on first submission, and why?
These sound like simple questions. They're not — at least not when data is spread across multiple EMR systems, billing platforms, and payroll tools across 90 locations. Getting a reliable baseline for any of these metrics is a project in itself. But it's a project that pays dividends immediately, because the baseline alone often reveals operational gaps that have been invisible.
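Once claim records from the different billing platforms have been pulled into one place, the baseline itself is simple arithmetic. A minimal sketch, assuming hypothetical record fields (`source`, `status`, `reason`) — real EMR and billing schemas will differ, and the consolidation step is where the actual work lives:

```python
from collections import Counter

# Illustrative claim records consolidated from two (hypothetical) sources.
claims = [
    {"source": "emr_a",   "status": "denied", "reason": "missing auth"},
    {"source": "emr_a",   "status": "paid",   "reason": None},
    {"source": "billing_b", "status": "denied", "reason": "coding error"},
    {"source": "billing_b", "status": "paid",   "reason": None},
    {"source": "billing_b", "status": "paid",   "reason": None},
]

denied = [c for c in claims if c["status"] == "denied"]
rate = len(denied) / len(claims)
reasons = Counter(c["reason"] for c in denied)

print(f"first-pass denial rate: {rate:.0%}")  # first-pass denial rate: 40%
print("top denial reasons:", reasons.most_common(2))
```

Note the `reason` tally: this is the "baseline reveals invisible gaps" effect — counting denials forces you to categorize why they happen, which is often the first operational insight of the engagement.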
We've seen this pattern across industries. A high-end appliance retailer needed AI-powered pricing recommendations, but first needed to define what "optimal pricing" meant for their business: margin targets by product category, competitive positioning benchmarks, and volume thresholds that trigger different strategies. The measurement framework came before the model — and it made the model dramatically more useful, because everyone agreed on what success looked like before the first prediction was generated.
Building Measurement Maturity as Part of the Engagement
Some consultancies treat measurement maturity as a prerequisite — a gate you have to pass before AI work can begin. That approach has a real cost: it turns what should be integrated work into sequential phases, delays the AI outcomes that leadership is asking for, and often loses organizational momentum.
At VisionWrights, we build measurement maturity as part of the AI engagement. Not because we skip it, but because we've found it works better when it's integrated with the use case that needs it. When you're defining KPIs for a specific AI application — say, an agent that handles dispatch scheduling — the conversation becomes concrete. You're not talking about measurement in the abstract. You're deciding what "good scheduling" means for your business, right now, for this system.
That concreteness matters. It makes the measurement work feel like progress toward the AI outcome, not a detour. And it produces measurement frameworks that are immediately useful, because they're tied to a system that will consume them.
The practical approach looks like this:
- Start with the business question. What decision or process is AI supposed to improve? That question defines which KPIs matter.
- Establish baselines early. Even rough baselines are better than none. You can refine them as data quality improves, but you need a starting point to measure against.
- Set targets with leadership, not for them. Targets should reflect business ambition, not technical possibility. An AI model can optimize for whatever you tell it to — the question is whether the target reflects what leadership actually wants.
- Build the feedback loop into the system. Don't bolt it on later. The mechanism for checking whether the AI is performing against targets should be part of the initial design — automated where possible, reviewed by humans on a regular cadence.
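The last step above — a feedback loop that is part of the design rather than bolted on — can be as simple as a scheduled check that compares actuals to targets and routes deviations to a human. A sketch under assumed inputs; the metric names, targets, and tolerances are invented for illustration:

```python
def review_kpis(kpis: dict[str, dict]) -> list[str]:
    """Return the names of KPIs whose latest actual is off target by
    more than the agreed tolerance -- the kind of check that runs on a
    regular cadence and flags metrics for human review."""
    flagged = []
    for name, m in kpis.items():
        if abs(m["actual"] - m["target"]) > m["tolerance"]:
            flagged.append(name)
    return flagged

# Hypothetical targets for the behavioral health example in the text.
kpis = {
    "billable hours per clinician per week": {"target": 28.0, "actual": 24.5, "tolerance": 2.0},
    "days from service to claim submission": {"target": 5.0,  "actual": 5.5,  "tolerance": 1.0},
}
print(review_kpis(kpis))  # ['billable hours per clinician per week']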
The Compounding Effect
Here's what makes measurement maturity worth the investment beyond the immediate AI project: it compounds. Once an organization has defined KPIs, established baselines, and built the habit of tracking against targets, every subsequent initiative — whether it's AI, automation, or operational improvement — benefits from that infrastructure.
The second AI project takes less time to scope, because the measurement framework already exists. The third project is even faster. Leadership gets more comfortable with data-driven decision-making, because they've seen the pattern: define the metric, set the target, measure the result. Trust builds incrementally.
Organizations that skip this step often end up in a frustrating cycle: build an AI system, deploy it, struggle to prove its value, lose executive confidence, hesitate to fund the next initiative. Measurement maturity breaks that cycle by making AI outcomes visible and verifiable from the start.
Where This Fits in the AI Foundations Framework
Measurement maturity is the second foundation layer in the AI Foundations framework. Below it: Data Strategy & Governance (knowing what data you have and keeping it trustworthy) and Data Engineering & Pipelines (moving that data reliably). Above it: Analytics & BI (validating that the numbers are right by making them visible) and AI & Intelligent Automation (taking action based on trusted, measured data).
Each layer depends on the ones beneath it. AI can't be accountable without measurement. Measurement can't be reliable without clean pipelines. Pipelines can't deliver value without governed data. The layers aren't sequential prerequisites — they're integrated, and they often get built in parallel. But they all need to exist.
If your organization is considering AI — or has already started and is struggling to demonstrate value — the most productive question might not be about the model, the vendor, or the technology. It might be simpler: do we know how to measure whether this is working? If the answer is no, that's where we start.