The AI Stack Most Enterprises Should Run
The reference stack for mid-large enterprises in 2026. Heavy buy at the bottom, build for differentiation at the top.
TL;DR
The reference enterprise AI stack in 2026:
| Layer | Components | Build/Buy |
|---|---|---|
| Foundation models | Multi-model (closed + open) | Buy |
| Model gateway | Routing, auth, rate limit | Buy |
| Tool catalog | MCP-compatible | Build |
| Agent framework | Light libraries + custom orchestration | Mostly buy |
| Eval and observability | Vendor platforms | Buy |
| Audit and governance | Specialized + custom | Mix |
| Productivity AI | Vendor products | Buy |
| Customer-facing AI | Wedge-specific (build differentiating, buy commodity) | Mix |
| Industry-specific AI | Vendor products + custom | Mix |
The shape: heavy buy at infrastructure layers, custom work concentrated at differentiating capabilities and tool catalog.
The reference stack for mid-large enterprises in 2026. The shape: heavy buy at the bottom, build for differentiation at the top.
The “what AI stack should we run” question gets answered differently by every consultant. The 2026 working pattern at companies executing AI well is consistent enough to describe. This piece is the reference architecture with the build/buy distribution.
The reference stack
Layer 1: Foundation models
Multi-model from day one. Mix of:
- 2–3 frontier closed models (Claude, GPT, Gemini).
- 1–2 open models for specific use cases.
- Specialized smaller models for narrow tasks (embeddings, classification).
Cost: scales with usage. For typical mid-large enterprise: $1M–$10M annually in foundation model spend.
Layer 2: Model gateway and routing
The layer that connects agents to foundation models. Functions:
- Authentication.
- Rate limiting and cost management.
- Model routing per use case.
- Fallback handling.
- Usage tracking.
Buy: LiteLLM, OpenRouter, Vercel AI Gateway, or cloud-native (Bedrock, Azure AI Studio).
Cost: $50K–$500K annually depending on scale and vendor.
Layer 3: Tool catalog and MCP infrastructure
Tools that agents can invoke. MCP-compatible architecture for future-proofing.
Build: most tool integrations. Internal data and systems are specific.
Buy: MCP server frameworks; vendor MCP servers for common systems.
Cost: 1–3 FTEs to build and maintain catalog.
Layer 4: Agent framework and orchestration
The framework agents are built in. The 2026 working pattern: light libraries + custom orchestration, not heavy frameworks.
Buy: model SDKs (OpenAI, Anthropic SDKs), Pydantic, structured-output libraries.
Build: custom orchestration following internal patterns.
Cost: included in AI engineering team capacity.
Layer 5: Eval and observability
Eval platform plus production observability. Often combined vendor offering.
Buy: Braintrust, Langfuse, LangSmith, or similar.
Cost: $100K–$500K annually for typical enterprise.
Layer 6: Audit and governance
Audit logging + governance tooling. Specific to your regulatory and operational needs.
Mix: vendor for audit platform (or build for specialized needs); custom integration with internal governance processes.
Cost: 1–2 FTEs operating; $50K–$200K platform cost.
Layer 7: Productivity AI
Tools your employees use for productivity. Coding assistants, writing assistants, research tools.
Buy: vendor products. Don’t build.
Cost: $20–$50/user/month for major productivity AI tools. For 10K-employee enterprise: $2.4M–$6M annually.
Layer 8: Customer-facing AI
AI in your products. Mix of differentiating (build) and commodity (buy).
Build: the 1–3 wedges that differentiate your product.
Buy: commodity capabilities (customer support, generic personalization, generic search).
Cost: highly variable; dominant cost item for AI-active enterprises.
Layer 9: Industry-specific AI
For regulated and specialized use cases (clinical AI, banking-specific AI, etc.).
Mix: industry vendor products + custom integration + governance overlay.
Cost: industry-specific.
The total cost picture
For a typical 10K-employee mid-large enterprise in 2026:
| Layer | Annual cost (range) |
|---|---|
| Foundation models | $1M–$10M |
| Gateway/routing | $50K–$500K |
| Tool catalog | $300K–$1M (people) |
| Agent framework | included |
| Eval/observability | $100K–$500K |
| Audit/governance | $200K–$700K |
| Productivity AI | $2M–$6M |
| Customer-facing AI | $2M–$20M+ |
| Industry-specific AI | varies |
| AI engineering team | $3M–$15M |
Total annual AI spend: $10M–$50M+ for a mid-large enterprise. Scales with strategic ambition and customer-facing AI investment.
What’s commonly wrong
Three mistakes I see often.
Mistake 1: Heavy investment in custom infrastructure
Building custom gateways, custom eval platforms, custom observability. Vendor solutions are mature; building reproduces commodity capability.
Mistake 2: Light investment in tool catalog
Treating tool integration as one-off engineering work rather than strategic infrastructure. Result: agents can’t access the tools they need; new agent development is slow.
Mistake 3: Single framework lock-in
Heavy commitment to one agent framework that becomes a constraint as use cases mature.
What to do this quarter
- Audit your stack against the reference. Where does your stack diverge?
- Identify infrastructure over-investment. Custom builds that should be replaced by vendor.
- Identify tool catalog under-investment. Probably the biggest gap at most enterprises.
- Plan the rebalance. Migrations take 12–18 months.
How the stack will evolve through 2027
Three predictions.
1. Vendor consolidation at infrastructure layers. 2–4 major vendors per layer; less fragmentation.
2. Standardization of MCP and tool catalog. Tool ecosystem stabilizes; build effort focuses on internal tools.
3. Open models in more layers. Self-hosted or third-party-hosted open models in routing alongside closed.
FAQ
What about RAG-specific infrastructure? Vector databases, retrieval frameworks. Significant for some use cases; lighter for others. Buy from established vendors (Pinecone, Weaviate, Qdrant, etc.) or use cloud-native (Vertex, Bedrock vector services).
What about fine-tuning infrastructure? For most enterprises, foundation model providers’ fine-tuning offerings are sufficient. Self-hosted fine-tuning infrastructure rarely justified.
How does this differ for AI-native vs. traditional companies? AI-native companies build deeper at most layers. Traditional companies should buy more aggressively. The reference is calibrated to traditional enterprise.
What about regulated industries? Add compliance overlays at audit/governance and industry-specific layers. May reduce buy options at some layers.
How does this map to cloud-provider AI offerings? Cloud-provider offerings cover several layers (gateway, eval in some cases, foundation models). Use selectively; don’t lock everything to one cloud.
Working with JAIN on AI stack architecture? We help executive teams design and execute the reference stack with appropriate customization. Book a 30-minute call.
Related reading:
Want to talk through this for your team?
30 minutes, no slides. We'll work the specific call your company is facing.