LlamaIndex Quietly Became an Agent Framework
The pivot from RAG library to agent runtime is real, and the Workflow abstraction is why.
Two years ago I used LlamaIndex to do one thing: index a directory of policy PDFs and answer questions against them. The query engine worked. The retriever worked. I didn’t think about it again. Last month I opened the latest docs to upgrade that pipeline and found a framework I barely recognized.
The README now describes LlamaIndex as the leading framework for building LLM-powered agents over your data. Two years ago it said something closer to a data framework for LLM applications. The shift is real. Workflows now sit at the center of the architecture. Agents are described as workflows. RAG pipelines are described as workflows. The thing that used to be a document indexing library is now an event-driven orchestration engine that happens to ship with the best document loaders in the ecosystem.
This is the kind of pivot vendors usually botch. The old API gets bolted onto the new one with adapter shims, the documentation forks, the community splits between people who learned the old way and people learning the new way. LlamaIndex avoided most of that. The Workflow primitive is a better abstraction than the Agent class it largely replaced, and the team had the discipline to make the migration path obvious instead of hiding it behind marketing copy.
Workflows are event-driven steps connected by typed messages. A step receives an event, does some work, and emits another event. The runtime handles routing, retries, and parallel execution. If that sounds like Temporal or Inngest with an LLM-flavored wrapper, that’s roughly right. The difference is that the events carry LLM-relevant context cleanly and the step decorators handle async-first execution without ceremony. You write a step like this:
    @step
    async def retrieve(self, ev: QueryEvent) -> RetrievalEvent:
        nodes = await self.retriever.aretrieve(ev.query)
        return RetrievalEvent(nodes=nodes, query=ev.query)

Three things matter here. The step is typed in both directions, so the workflow engine can validate that your graph actually wires together before any model call happens. The step is async, so I/O-heavy work runs without blocking. The event is a real object, not a dict, which means refactoring across steps doesn’t degrade into find-and-replace across keys you hoped you spelled the same everywhere.
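To make the "validate before any model call" point concrete, here is a minimal sketch in plain Python with no LlamaIndex import. It is a toy runtime, not the library's implementation: the `step` decorator, `validate`, `run`, and the three event classes below are all hypothetical stand-ins I wrote for illustration. The idea it shows is only that type annotations on steps are enough to detect an unhandled event before anything executes.

```python
import asyncio
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class QueryEvent:
    query: str


@dataclass
class RetrievalEvent:
    query: str
    nodes: list[str]


@dataclass
class StopEvent:
    result: str


def step(fn):
    """Toy decorator: record which event a step accepts and which it emits."""
    hints = get_type_hints(fn)
    fn.emits = hints.pop("return")
    (fn.accepts,) = hints.values()  # exactly one annotated event parameter
    return fn


@step
async def retrieve(ev: QueryEvent) -> RetrievalEvent:
    # Stand-in for a real retriever call.
    return RetrievalEvent(query=ev.query, nodes=[f"doc about {ev.query}"])


@step
async def synthesize(ev: RetrievalEvent) -> StopEvent:
    return StopEvent(result=f"{len(ev.nodes)} node(s) for '{ev.query}'")


def validate(steps):
    """Fail fast if a step emits an event that no other step accepts."""
    accepted = {s.accepts for s in steps}
    for s in steps:
        if s.emits is not StopEvent and s.emits not in accepted:
            raise TypeError(f"{s.__name__} emits unhandled {s.emits.__name__}")


async def run(steps, event):
    validate(steps)  # wiring errors surface before any step executes
    handlers = {s.accepts: s for s in steps}
    while not isinstance(event, StopEvent):
        event = await handlers[type(event)](event)
    return event.result
```

Calling `asyncio.run(run([retrieve, synthesize], QueryEvent(query="leave policy")))` walks both steps, while `validate([retrieve])` raises a `TypeError` because nothing accepts `RetrievalEvent`. With dict-shaped events, that second mistake would only show up at runtime, deep inside a step.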
This is the part that pulled me back in. Most agent frameworks I evaluated last quarter either pretended async didn’t exist or required me to wrap their synchronous APIs in thread pools to recover throughput. LlamaIndex Workflows assumes async. The agent loop is a workflow where one step calls a model and another step routes based on tool calls. There is no separate agent execution model fighting the rest of your code.
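The "agent loop is a workflow" claim can be sketched the same way. This is an assumption-laden toy, not LlamaIndex code: `fake_model` is a hypothetical stub standing in for a real LLM call, and `ModelInput`, `ToolCall`, `Done`, and `TOOLS` are names I made up. What it shows is the shape: one async step calls the model, another runs the requested tool, and the loop is just event dispatch.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ModelInput:
    messages: list[str]


@dataclass
class ToolCall:
    messages: list[str]
    name: str
    argument: str


@dataclass
class Done:
    answer: str


# Hypothetical tool registry with a single tool.
TOOLS = {"word_count": lambda text: str(len(text.split()))}


async def fake_model(messages):
    """Hypothetical model stub: requests a tool once, then answers."""
    await asyncio.sleep(0)  # placeholder for network I/O
    if not any(m.startswith("tool:") for m in messages):
        return {"tool": "word_count", "argument": messages[0]}
    count = messages[-1].removeprefix("tool:")
    return {"answer": f"The text has {count} words."}


async def call_model(ev: ModelInput):
    # Step 1: call the model and route on whether it asked for a tool.
    reply = await fake_model(ev.messages)
    if "tool" in reply:
        return ToolCall(ev.messages, reply["tool"], reply["argument"])
    return Done(reply["answer"])


async def run_tool(ev: ToolCall):
    # Step 2: run the tool and hand the result back to the model step.
    result = TOOLS[ev.name](ev.argument)
    return ModelInput(ev.messages + [f"tool:{result}"])


async def agent(question, max_steps=10):
    steps = {ModelInput: call_model, ToolCall: run_tool}
    event = ModelInput([question])
    for _ in range(max_steps):
        event = await steps[type(event)](event)
        if isinstance(event, Done):
            return event.answer
    raise RuntimeError("agent did not finish within max_steps")
```

`asyncio.run(agent("count these four words"))` goes model, tool, model and then returns. Nothing about the loop is agent-specific, which is the point: the same dispatch would drive a retrieval pipeline.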
The honest comparison is against LangGraph, which is the framework people reach for when they want graph-based agent orchestration. LangGraph’s state-machine model is more rigorous. You declare nodes and edges explicitly, the state object is shared and reducible, and the supervisor pattern for multi-agent setups is more developed. LlamaIndex Workflows trades some of that rigor for ergonomics. The event-driven model feels closer to writing normal Python and less like configuring a state machine. For a small team that needs to ship something maintainable, this matters more than the theoretical purity of the abstraction.
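For contrast, here is roughly what "explicit nodes and edges over a shared, reducible state" means, again as a plain-Python sketch rather than LangGraph's actual API. `REDUCERS`, `NODES`, `EDGES`, `merge`, and `run_graph` are hypothetical names. Each node returns a partial update, and a per-key reducer decides how that update folds into the shared state.

```python
import operator

# Per-key reducers: how a node's partial update merges into shared state.
# Keys without a reducer are simply overwritten.
REDUCERS = {"messages": operator.add}


def merge(state, update):
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        if reducer is not None and key in state:
            merged[key] = reducer(state[key], value)
        else:
            merged[key] = value
    return merged


def plan(state):
    return {"messages": ["plan: choose tool"]}


def act(state):
    return {"messages": ["act: tool finished"], "status": "done"}


# The graph is declared up front: nodes, then edges between them.
NODES = {"plan": plan, "act": act}
EDGES = {"plan": "act", "act": None}


def run_graph(state, start):
    node = start
    while node is not None:
        state = merge(state, NODES[node](state))
        node = EDGES[node]
    return state
```

Running `run_graph({"messages": ["user: hi"]}, "plan")` appends both node messages to the list and sets `status`. Compared with the event-passing style, the data flow lives in one state object and one edge table, which is more to declare but easier to inspect.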
The pivot does have rough edges. The legacy Agent and AgentRunner classes still exist in the codebase, still appear in older tutorials, and still work, which means new practitioners hit the documentation and have to figure out which abstraction is the current one. The team has moved hard toward AgentWorkflow as the canonical pattern, but the older surface area hasn’t been deprecated cleanly. If you’re starting today, ignore everything that doesn’t say workflow and you’ll save yourself a week.
The other rough edge is the multi-agent story. AgentWorkflow handles handoffs and shared state between agents, but the patterns are less mature than what LangGraph or CrewAI offer. If your use case is one agent with a real toolset and a real retrieval layer, LlamaIndex is at or near the top of the lineup. If your use case is a coordinated team of specialist agents with handoffs and shared scratchpads, you’ll spend more time reinventing what other frameworks give you out of the box.
What didn’t change is the data layer, and this is the actual reason to pick LlamaIndex over a more abstract framework. LlamaHub still has the largest catalog of document loaders, retrievers, and storage integrations in the ecosystem. LlamaParse handles the kind of PDFs that break every other parser I’ve tried, including the ones with three-column layouts and embedded tables that other libraries reduce to noise. If your agent needs to work with documents rather than chat about them, you start every other framework at a deficit because you’ll be reimplementing pieces LlamaIndex already shipped two years ago.
The pivot is real. It’s not naming. The Workflow abstraction is an architectural commitment, not a marketing gesture. The team didn’t paint agents on the side of the truck and call it new. They built an event-driven runtime, made it the core, and rewrote the agent patterns to sit on top of it. That’s the move I want to see when a tools company adapts to a new shape of work, and few teams have managed it without breaking their users along the way.
Where this lands in production: I’d use LlamaIndex today for any agent whose primary job is reasoning over documents or structured retrieval, particularly anything where the data layer matters more than the orchestration layer. For pure tool-using agents with no retrieval involved, I’d still reach for Pydantic AI when type safety dominates or LangGraph when explicit state machines do. The choice isn’t about which framework is best; it’s about which gap in your system matters most. LlamaIndex is the right answer when the gap is data.
The library that started as RAG infrastructure didn’t pretend to become something else. It built the orchestration layer it always needed, made workflows the spine, and kept the data integrations its users depended on. That’s the rarer transition. Most frameworks pivot by abandoning what made them useful. This one pivoted by building outward from it.


