Why Pilots Stall and How to Design Ones That Don't

TL;DR

Five reasons pilots stall:

No production owner — pilot succeeds but no one is accountable to put it into production.
Wrong success metric — measures user satisfaction or accuracy instead of business outcome.
Path-to-production undefined — no plan for going from pilot to scale.
Sponsor changes mid-flight — exec who funded the pilot moves; no successor inherits ownership.
Scope creep — pilot becomes the platform, never ships.

The fix: define production exit criteria, production owner, and production timeline at pilot inception. Pilots that stall failed at design time, not at execution time.

Most AI pilots stall in the same five ways. The fix is at design time, not at execution time.

The “AI pilot purgatory” pattern is widespread enough to have a name. Companies run AI pilots, the pilots produce interesting results, and nothing goes to production. Six months later, another pilot starts. The pattern isn’t an execution problem; it’s a design problem. This piece covers what to fix at design time so the pilot actually ships.

The five stall patterns

1. No production owner

The pilot is sponsored by an exec, executed by a team, and lives in pilot land. When the pilot completes successfully, no one is on the hook to put it into production. The exec is satisfied with the result; the team moves to the next pilot.

Fix: at pilot inception, name the production owner — the function lead or operator who will inherit the system in production. Give them veto on the pilot design (so the production handoff isn’t an afterthought).

2. Wrong success metric

Pilots are measured on user satisfaction or accuracy alone. These are necessary but not sufficient: they don’t tell you whether the production version creates business value.

Fix: pilot success metrics should include a business outcome (revenue, cost, conversion, cycle time) measured against a control group. Without this, the pilot can succeed on its own terms and still not justify production.
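As a rough illustration of what "measured against control" means in practice, the sketch below computes the relative lift of a pilot cohort over a control cohort. The function name, the choice of conversion rate as the outcome, and the figures are all assumptions for illustration, not results from any real pilot.

```python
def relative_lift(pilot_outcome: float, control_outcome: float) -> float:
    """Relative change of the pilot cohort's outcome over the control cohort's."""
    if control_outcome == 0:
        raise ValueError("control outcome must be non-zero")
    return (pilot_outcome - control_outcome) / control_outcome


# Hypothetical conversion rates: 4.6% with the AI agent, 4.0% without it.
lift = relative_lift(0.046, 0.040)
print(f"Relative lift vs. control: {lift:.1%}")
```

A real pilot would also need enough volume in both cohorts for the difference to be meaningful; this sketch covers only the arithmetic.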

3. Path-to-production undefined

The pilot worked; what does production look like? Engineering ramp, integration with existing systems, supervision model, governance approval, compliance work — all undefined. The transition feels too daunting; the pilot becomes “the system” or quietly dies.

Fix: at pilot inception, design the path to production in detail. Specifically: what changes from pilot to prod, who owns each change, how long each change takes, what’s the go-live criterion.

4. Sponsor changes mid-flight

The exec who sponsored the pilot leaves, gets reorged, or changes priorities. The new exec doesn’t inherit ownership; the pilot loses its champion and quietly stalls.

Fix: at pilot inception, identify both a sponsor and a backup sponsor (typically the operating function lead). The backup ensures continuity if the primary sponsor changes.

5. Scope creep

The pilot starts small but grows. New features, new use cases, new integrations. The pilot becomes “the platform” that’s perpetually 6 months from production. The growing scope makes the production transition more daunting, which makes the project even more pilot-like.

Fix: at pilot inception, fix the scope and the production date. New scope items go to v2, not pilot. Time-box ruthlessly.

The design checklist

Before pilot kickoff:

Production owner named, with veto on pilot design.
Sponsor and backup sponsor named.
Success metrics include a business outcome measured against control.
Path-to-production designed in detail, with named owners per change.
Production date set. (Realistic; defended.)
Scope frozen for the pilot. New items go to v2.
Go/kill decision criteria documented.

If any of these is missing at kickoff, the pilot is at risk of stalling.
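One way to make the checklist enforceable is to record it as structured data and list the gaps before kickoff. The sketch below is a minimal version of that idea; the class and field names are hypothetical, chosen to mirror the seven items above.

```python
from dataclasses import dataclass, fields


@dataclass
class PilotKickoffChecklist:
    # One flag per checklist item; field names are illustrative.
    production_owner_named: bool = False
    sponsor_and_backup_named: bool = False
    business_outcome_metric_vs_control: bool = False
    path_to_production_designed: bool = False
    production_date_set: bool = False
    scope_frozen: bool = False
    go_kill_criteria_documented: bool = False


def missing_items(checklist: PilotKickoffChecklist) -> list[str]:
    """Return the names of checklist items that are not yet satisfied."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


# Hypothetical pilot with only two items in place at kickoff.
checklist = PilotKickoffChecklist(production_owner_named=True, scope_frozen=True)
gaps = missing_items(checklist)
at_risk = bool(gaps)  # any missing item means the pilot is at risk of stalling
print(at_risk, gaps)
```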

What to do at pilot completion

Three decisions, made fast.

1. Go or kill. Don’t extend; don’t pause; don’t iterate forever. If the success criteria are met, go to production. If they are not met, kill.

2. If go: name the production date. Within 30 days of pilot completion. Don’t let the production transition become its own multi-month project.

3. Document what was learned. Even a kill teaches you something. Capture it for future pilots.

What sponsors should do differently

Three habits.

1. Sponsor pilots specifically, not “AI” generically. “I want to pilot AI” produces stalled pilots. “I want to pilot agent X for outcome Y” produces shippable pilots.

2. Set the production expectation early. The team should know that the goal is production, not pilot. Different work culture; different decisions.

3. Don’t sponsor more pilots than the org can productionize. A typical mid-to-large enterprise can productionize 3–6 AI agents per year. If you’re sponsoring 20 pilots, most will stall regardless of design.
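The capacity point in the third habit is simple arithmetic. The sketch below works it through with the article's figures (20 sponsored pilots, capacity for 3 to 6 productionized agents per year); the function name is an assumption for illustration.

```python
def max_productionize_share(pilots_sponsored: int, yearly_capacity: int) -> float:
    """Upper bound on the share of sponsored pilots that can reach production in a year."""
    if pilots_sponsored <= 0:
        raise ValueError("pilots_sponsored must be positive")
    return min(1.0, yearly_capacity / pilots_sponsored)


# 20 pilots against the low and high ends of the capacity range.
for capacity in (3, 6):
    share = max_productionize_share(20, capacity)
    print(f"capacity {capacity}: at most {share:.0%} of pilots can ship")
```

At 20 pilots the ceiling is 15 to 30 percent, well under the 50–70% range the FAQ suggests for well-designed production-bound pilots, so the portfolio size alone guarantees stalls.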

Counter: what about exploratory pilots?

Some pilots are genuinely exploratory — testing whether a use case is feasible at all. These have different success criteria.

For exploratory pilots:

Different label (“feasibility test”). Don’t conflate with production pilots.
Different timeline (4–8 weeks, not 8–16 weeks).
Different output (a decision document, not a system).
Different audience (the exec; not the operating function).

Exploratory work is valuable; just don’t confuse it with production-bound work.

What to do this quarter

Audit your AI pilots. Which are production-bound? Which are exploratory? Which are stalled?
For production-bound pilots: apply the design checklist. Fix the gaps.
For exploratory pilots: relabel and time-box. Don’t let them drift toward production.
For stalled pilots: kill or push to production decisively. Don’t let them sit.
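A first pass at the audit can be as simple as flagging pilots that have outrun their time-box. The sketch below uses the upper bounds from the FAQ (16 weeks for production-bound pilots, 8 weeks for exploratory ones); the `Pilot` class and the portfolio entries are hypothetical.

```python
from dataclasses import dataclass

# Upper time-box bounds in weeks, taken from the FAQ in this article.
MAX_WEEKS = {"production-bound": 16, "exploratory": 8}


@dataclass
class Pilot:
    name: str
    kind: str  # "production-bound" or "exploratory"
    weeks_running: int


def over_time_box(pilots: list[Pilot]) -> list[str]:
    """Return names of pilots that have run longer than their time-box."""
    return [p.name for p in pilots if p.weeks_running > MAX_WEEKS[p.kind]]


# Hypothetical portfolio for illustration.
portfolio = [
    Pilot("invoice-triage", "production-bound", 22),
    Pilot("support-summaries", "production-bound", 10),
    Pilot("contract-search", "exploratory", 12),
]
print(over_time_box(portfolio))  # ['invoice-triage', 'contract-search']
```

Pilots flagged this way are the candidates for the kill-or-push decision; the rest go through the design checklist or the relabel-and-time-box step.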

FAQ

How long should a pilot last? Production-bound: 8–16 weeks. Exploratory: 4–8 weeks. Anything longer is at risk of becoming the system instead of a pilot.

What’s the right pilot-to-production ratio? ~50–70% of well-designed production-bound pilots should productionize. The rest should be killed cleanly. Lower ratios suggest design problems; higher ratios suggest insufficient ambition.

Should we run pilots in production or in a sandbox? Pilots in real production, with limited scope (specific customer cohort, specific use case slice). Sandboxes don’t surface the integration issues that often kill production transitions.

What about CFO objections to investing before the pilot completes? The pilot’s purpose is the go/kill decision. Production investment is justified at pilot completion if the business case clears. Don’t try to fund production before pilot results; do plan production capacity.

How do we handle multiple stalled pilots from prior AI initiatives? Triage. Kill the ones that won’t go. Pick 1–2 to revive with the design checklist applied. Don’t try to revive all of them.

Working with JAIN on AI pilot design? We help executive teams design pilots that ship to production, not ones that stall. Book a 30-minute call.
