All resources Build vs Buy & Tooling

Choosing an Agent Framework

Three rules and the working pattern. The framework matters less than the architecture.

TL;DR

Three rules:

  1. Don’t over-commit to one framework early. Frameworks change fast; the cost of switching frameworks 12 months in is real.
  2. Prefer light frameworks plus custom code over heavy frameworks. Better long-term flexibility.
  3. Architecture matters more than framework. Clean separation between agent logic, tool integration, and infrastructure makes any framework workable.

The 2026 working pattern: most production agents use 2–3 specific libraries plus custom orchestration, not a single comprehensive framework.


Three rules and the working pattern. The framework matters less than the architecture.

The agent framework conversation gets stuck on which framework — LangChain, LlamaIndex, Pydantic AI, AutoGen, CrewAI, custom. The more important question is the architectural pattern. Companies that pick architecture first and framework second end up with maintainable systems; companies that go framework-first often hit limitations and have to rewrite. This piece is the working frame.

What’s specific to agent frameworks

Agent frameworks try to solve:

  • LLM API abstraction (call models, parse responses).
  • Tool integration (define and call tools).
  • Memory and state (conversation, long-term context).
  • Orchestration (sequence of LLM calls, control flow).
  • Multi-agent coordination.
  • Output parsing and validation.

Each framework takes a different stance on these. The differences matter for ergonomics and capability ceiling.

The major frameworks in 2026

LangChain

Pros: comprehensive, mature, ecosystem.

Cons: heavy, opinionated, can be hard to debug, frequent breaking changes historically.

When it works: standard patterns where the framework’s defaults match your needs.

When it doesn’t: complex custom logic that fights the framework.

LlamaIndex

Pros: strong for RAG and document-centric agents.

Cons: heavy for non-RAG patterns; some overlap with LangChain.

When it works: RAG-heavy use cases.

Pydantic AI

Pros: lightweight, type-safe, opinionated about clean architecture.

Cons: newer; smaller ecosystem.

When it works: production agents where type safety and clean architecture matter.

AutoGen / CrewAI

Pros: multi-agent orchestration.

Cons: opinionated about multi-agent patterns; can over-fit if your problem isn’t really multi-agent.

When they work: genuine multi-agent use cases.

Custom (no major framework)

Pros: maximum flexibility; no framework lock-in; small footprint.

Cons: you build patterns yourself; team must have the experience.

When it works: experienced AI engineering teams; specific architectural needs.

What most production agents actually use

The pattern at companies with mature AI engineering in 2026:

  • LLM SDK (OpenAI SDK, Anthropic SDK, or AI SDK from Vercel) for model calls.
  • Pydantic for structured output parsing.
  • Specific libraries for specific needs (Instructor for structured output, RAG-specific libraries when relevant).
  • Custom orchestration code for control flow.
  • Custom tool integration following a defined pattern.

This is less “framework” and more “set of libraries with custom orchestration.” Lighter than LangChain; more capable than out-of-the-box.

The architecture that matters

Three architectural decisions matter more than framework choice.

1. Clean separation of concerns

Agent logic separated from:

  • Tool integration code.
  • Eval/observability instrumentation.
  • Infrastructure (model routing, rate limiting).

Clean separation lets you swap frameworks; messy coupling locks you in.

2. Standardized tool interface

Tools should follow a single internal pattern (whatever you pick). Don’t let each agent define tools differently. Consistent tool interface enables sharing and observability.

3. Stateful vs stateless agents

Decide explicitly. Stateless agents (each call independent) are simpler and more reliable. Stateful agents (memory across calls) are more capable but more complex. Pick based on use case.

What to do this quarter

  1. Audit your agent codebase. What frameworks are in use? Is the architecture clean?
  2. If multiple frameworks are in use: standardize on a smaller set.
  3. If one heavy framework is in use: evaluate whether the architectural lock-in is acceptable.
  4. Document the pattern. Whatever framework approach, document it so new agents follow the same pattern.

Counter: don’t frameworks make us faster?

Sometimes. For a junior team, frameworks accelerate the first agent. For complex production systems, framework limitations can slow you down later.

The pattern: use frameworks early; refactor to lighter patterns as the team and use cases mature. Don’t lock in to a heavy framework if you can predict you’ll outgrow it.

FAQ

Will frameworks consolidate? Likely. Some consolidation has already happened (LangChain has acquired competitors). Expect 2–4 major frameworks plus the ecosystem of lighter libraries by 2027.

What about cloud-provider frameworks? Azure AI, Bedrock, Google Vertex provide framework offerings. Adoption mixed; vendor lock-in concern is real.

What about TypeScript / JavaScript frameworks? Vercel AI SDK is the leading TypeScript option. Strong choice for web-app-integrated agents. Some Python frameworks have JS counterparts.

Should we contribute to open-source frameworks? For active engineering teams, yes — both for influence and for the upskilling. Be selective; don’t contribute broadly.

How does this affect AI engineer hiring? Hire for engineering judgment, not framework experience. Strong engineers learn frameworks quickly; framework-only experience without engineering depth is a flag.


Working with JAIN on agent architecture? We help executive teams pick frameworks and design architectures that don’t lock in. Book a 30-minute call.

Related reading:

Want to talk through this for your team?

30 minutes, no slides. We'll work the specific call your company is facing.