Haystack Is Still Here, And It's Better Than You Remember
The 2.x rewrite turned a RAG framework into a quietly serious agent platform, and almost nobody talked about it.
The last RAG service my team shipped without an emergency rollback ran on Haystack. The pipeline was eleven components long. It validated its own wiring at build time, ran async under FastAPI without bespoke glue, exported OpenTelemetry traces to the same dashboard the rest of the platform used, and required exactly one engineer to maintain. Nobody on the team gave a conference talk about it. That is the entire point.
Haystack is the open-source framework from deepset, a German NLP company that has been building production search and QA systems since before “agent framework” was a phrase that meant anything. The 1.x line, which many engineers tried once around 2022 and then quietly stopped thinking about, was a monolithic toolkit with separate indexing and query pipelines, a REST surface that did not quite feel native, and abstractions that had grown by accretion rather than design. If your memory of Haystack ends there, you have real catching up to do: Haystack 2.0, released in March 2024, was a ground-up rewrite rather than an incremental release. The framework that wears the same name in 2026 is a different framework, and pretending otherwise is the reason it gets skipped in tool roundups by engineers who should know better.
The architectural primitive in 2.x is the Component: a Python class with typed inputs, typed outputs, and a single run method. A Pipeline is a directed graph of Components with the connections drawn explicitly between named sockets. When the pipeline is constructed, Haystack walks the graph and validates that every connection links a producer output to a consumer input of a compatible type. Misnamed sockets, type mismatches, and missing components are all caught at build time rather than at the moment a user query hits the broken edge in production. This sounds like table stakes until you compare it to the agent frameworks that spent April defining themselves as “agnostic” by making every payload an unstructured dictionary.
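To make the build-time check concrete, here is a minimal plain-Python sketch of the idea. It is a toy model, not Haystack's API: ToyComponent, ToyPipeline, WiringError, and wiring_error are hypothetical names invented for illustration, and the type check is a strict identity comparison where a real framework would be more forgiving.

```python
from typing import Callable, Dict, List, Tuple


class WiringError(Exception):
    """Raised when a connection cannot be made between two sockets."""


class ToyComponent:
    """A step with named, typed input and output sockets (toy model)."""

    def __init__(self, inputs: Dict[str, type], outputs: Dict[str, type],
                 run: Callable[..., dict]):
        self.inputs = inputs
        self.outputs = outputs
        self.run = run


class ToyPipeline:
    """Holds components and checks every edge at the moment it is drawn."""

    def __init__(self) -> None:
        self.components: Dict[str, ToyComponent] = {}
        self.edges: List[Tuple[str, str, str, str]] = []

    def add(self, name: str, comp: ToyComponent) -> None:
        self.components[name] = comp

    def connect(self, sender: str, receiver: str) -> None:
        src, out_socket = sender.split(".")
        dst, in_socket = receiver.split(".")
        for name in (src, dst):
            if name not in self.components:
                raise WiringError(f"missing component: {name}")
        outputs = self.components[src].outputs
        inputs = self.components[dst].inputs
        if out_socket not in outputs:
            raise WiringError(f"{src} has no output socket {out_socket!r}")
        if in_socket not in inputs:
            raise WiringError(f"{dst} has no input socket {in_socket!r}")
        if outputs[out_socket] is not inputs[in_socket]:
            raise WiringError(
                f"type mismatch: {sender} is {outputs[out_socket].__name__}, "
                f"{receiver} expects {inputs[in_socket].__name__}"
            )
        self.edges.append((src, out_socket, dst, in_socket))


def wiring_error(pipe: ToyPipeline, sender: str, receiver: str) -> str:
    """Return the error message for a bad connection, or '' if it is valid."""
    try:
        pipe.connect(sender, receiver)
        return ""
    except WiringError as exc:
        return str(exc)


pipe = ToyPipeline()
pipe.add("embedder", ToyComponent({"text": str}, {"embedding": list},
                                  lambda text: {"embedding": [float(len(text))]}))
pipe.add("retriever", ToyComponent({"query_embedding": list}, {"documents": list},
                                   lambda query_embedding: {"documents": []}))
pipe.add("builder", ToyComponent({"question": str}, {"prompt": str},
                                 lambda question: {"prompt": question}))

pipe.connect("embedder.embedding", "retriever.query_embedding")  # valid edge

# Each of these fails while the graph is being built, before any query runs.
print(wiring_error(pipe, "embedder.embeding", "retriever.query_embedding"))  # misnamed socket
print(wiring_error(pipe, "retriever.documents", "builder.question"))         # list into str
print(wiring_error(pipe, "embedder.embedding", "ranker.documents"))          # missing component
```

All three bad edges are rejected during construction, which is the property the paragraph above is describing: the broken wire never reaches a request.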
The composition story is what makes the pipeline model age well. A retrieval pipeline is a graph of an embedder, a retriever, and a ranker. A RAG pipeline adds a prompt builder and a generator. An agent pipeline adds an Agent component that owns the tool-calling loop and routes back through retrieval steps as needed. The same Component contract holds across all of them, which means the eleven-component pipeline I shipped uses the same primitives as the three-component prototype that came before it. The framework does not change shape as the problem grows.
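A rough sketch of that composition claim, in plain Python rather than Haystack code. It flattens the graph into a linear chain for brevity, and every function here (embed, retrieve, rank, build_prompt, generate, chain) is a hypothetical stand-in with a stubbed body. The point is only that the RAG version reuses the retrieval prototype unchanged.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Doc:
    text: str
    score: float = 0.0


CORPUS = [
    Doc("Pipelines are graphs of components"),
    Doc("Agents call tools"),
    Doc("Haystack validates pipeline wiring"),
]


def embed(query: str) -> List[str]:
    """Stand-in embedder: a bag of lowercase tokens instead of a vector."""
    return query.lower().split()


def retrieve(tokens: List[str]) -> List[Doc]:
    """Score each document by how many query tokens it contains."""
    scored = [
        Doc(d.text, float(sum(t in d.text.lower().split() for t in tokens)))
        for d in CORPUS
    ]
    return [d for d in scored if d.score > 0]


def rank(docs: List[Doc]) -> List[Doc]:
    """Order documents from highest to lowest score."""
    return sorted(docs, key=lambda d: d.score, reverse=True)


def build_prompt(docs: List[Doc]) -> str:
    context = "\n".join(d.text for d in docs)
    return f"Answer using only this context:\n{context}"


def generate(prompt: str) -> str:
    """Stand-in generator: reports how many context lines it was given."""
    return f"stub reply citing {len(prompt.splitlines()) - 1} documents"


def chain(*steps: Callable) -> Callable:
    """Compose steps so that each one receives the previous step's output."""
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run


# The three-step prototype...
retrieval = chain(embed, retrieve, rank)
# ...is reused as-is inside the larger pipeline.
rag = chain(retrieval, build_prompt, generate)

print([d.text for d in retrieval("pipeline wiring graphs")])
print(rag("pipeline wiring graphs"))
```

Growing from retrieval to RAG adds two steps and changes none of the existing ones, which is the "does not change shape" property in miniature.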
The Agent component is where the 2.x story becomes a 2026 story. deepset added it in Haystack 2.12, and the design is intentionally narrow: an Agent is a Component that wraps a generator, a tool list, and a control loop, with state passed through the pipeline graph the same way any other intermediate value would be. There is no separate agent runtime, no parallel execution model, no second framework hiding inside the first. If LangGraph’s pitch is “state machines for agents,” Haystack’s pitch is closer to “agents are one Component in a pipeline you already understand.” That framing trades flexibility for legibility, and the trade pays off the third time someone unfamiliar with the codebase has to debug a production incident at 11 p.m.
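The shape described above (a generator, a tool list, and a bounded control loop behind a single run method) can be sketched in a few lines of plain Python. This is an illustration of the pattern, not Haystack's Agent class: ToyAgent, ToolCall, scripted_generator, and the max_steps parameter are all hypothetical, and the "model" is a scripted stub.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

Message = Tuple[str, str]  # (role, content)


@dataclass
class ToolCall:
    name: str
    argument: str


class ToyAgent:
    """One component with one run method; the loop lives inside it."""

    def __init__(self,
                 generator: Callable[[List[Message]], Union[str, ToolCall]],
                 tools: Dict[str, Callable[[str], str]],
                 max_steps: int = 5):
        self.generator = generator
        self.tools = tools
        self.max_steps = max_steps

    def run(self, query: str) -> dict:
        messages: List[Message] = [("user", query)]
        for _ in range(self.max_steps):
            reply = self.generator(messages)
            if isinstance(reply, ToolCall):
                # The model asked for a tool: run it and feed the result back.
                tool = self.tools.get(reply.name)
                result = tool(reply.argument) if tool else f"unknown tool: {reply.name}"
                messages.append(("tool", result))
                continue
            # A plain string is treated as the final answer.
            messages.append(("assistant", reply))
            return {"reply": reply, "messages": messages}
        # Step budget exhausted without a final answer.
        return {"reply": None, "messages": messages}


def scripted_generator(messages: List[Message]) -> Union[str, ToolCall]:
    """Stub model: search once, then answer from the tool result."""
    role, content = messages[-1]
    if role == "user":
        return ToolCall("search", content)
    return f"Based on the search result: {content}"


def search(query: str) -> str:
    return f"1 document matching {query!r}"


agent = ToyAgent(scripted_generator, {"search": search})
result = agent.run("haystack agents")
print(result["reply"])
```

Because the agent returns an ordinary dict from run, a downstream step can consume its output like any other intermediate value, and the step budget keeps a misbehaving model from looping indefinitely.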
The integration surface is the part that makes Haystack a viable default rather than a niche pick. The document store list covers Elasticsearch, OpenSearch, Weaviate, Qdrant, Pinecone, Chroma, Milvus, pgvector, MongoDB Atlas, Astra DB, and a half-dozen others, all behind a uniform interface that lets you swap stores without rewriting the rest of the pipeline. The model integrations include OpenAI, Anthropic, Cohere, Hugging Face, AWS Bedrock, Vertex AI, Azure OpenAI, Ollama, vLLM, and the long tail of OSS inference endpoints. The deployment story is Hayhooks, deepset’s pipeline-as-REST-service runner, or you write your own FastAPI wrapper around the pipeline object, which is a ten-line file. Nothing here is invented for the sake of being new. All of it is the boring thing that already works.
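The "swap stores without rewriting the rest" property comes down to coding against an interface instead of a backend. Here is a small sketch of that idea using a typing.Protocol. The interface below (write, search) and both store classes are hypothetical and far simpler than Haystack's actual document store contract; they only show why the calling code does not change.

```python
from typing import Dict, List, Protocol


class DocumentStore(Protocol):
    """Hypothetical uniform interface that every backend satisfies."""

    def write(self, docs: List[str]) -> int: ...
    def search(self, term: str, top_k: int) -> List[str]: ...


class ListStore:
    """Backend A: keeps documents in a list and scans them linearly."""

    def __init__(self) -> None:
        self.docs: List[str] = []

    def write(self, docs: List[str]) -> int:
        self.docs.extend(docs)
        return len(docs)

    def search(self, term: str, top_k: int) -> List[str]:
        hits = [d for d in self.docs if term.lower() in d.lower().split()]
        return hits[:top_k]


class IndexStore:
    """Backend B: maintains an inverted index from token to document ids."""

    def __init__(self) -> None:
        self.docs: List[str] = []
        self.index: Dict[str, List[int]] = {}

    def write(self, docs: List[str]) -> int:
        for doc in docs:
            doc_id = len(self.docs)
            self.docs.append(doc)
            for token in set(doc.lower().split()):
                self.index.setdefault(token, []).append(doc_id)
        return len(docs)

    def search(self, term: str, top_k: int) -> List[str]:
        ids = self.index.get(term.lower(), [])
        return [self.docs[i] for i in ids[:top_k]]


def fetch_context(store: DocumentStore, term: str, top_k: int = 2) -> List[str]:
    """Pipeline-side code: depends only on the interface, not the backend."""
    return store.search(term, top_k)


DOCS = [
    "Haystack pipelines validate wiring",
    "Agents call tools",
    "Haystack ships hayhooks",
]

for store in (ListStore(), IndexStore()):
    store.write(DOCS)
    print(type(store).__name__, fetch_context(store, "haystack"))
```

Both backends return the same documents to fetch_context, so changing stores is a one-line change at construction time and nothing downstream has to know.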
Where Haystack trails is exactly where you would expect a framework written by people who care about production to trail. The developer-experience polish is workmanlike rather than seductive. The pipeline-build error messages are accurate but rarely delightful. The visual builder in deepset Studio exists and is genuinely useful for non-engineering stakeholders, though the OSS package itself is a Python-first tool that assumes you read code more than diagrams. The documentation is thorough but dense, optimized for the engineer who needs the right answer, not the engineer who wants to feel inspired about agents in the next six weeks. None of this is a defect. It is a posture, and the posture is consistent.
The honest comparison to the frameworks that dominated this newsletter for the last two weeks comes down to what kind of problem you are actually solving. If the work is heavily retrieval-shaped, with documents, embeddings, rerankers, structured outputs, and a model call at the end, Haystack is the framework that was already built for that and has the most mature library of components for the supporting cast. If the work is heavily orchestration-shaped, with branching control flow, human-in-the-loop checkpoints, and multi-step recovery logic, LangGraph’s state-machine model maps to the problem more naturally. If the work is a structured multi-agent workflow with role specialization, CrewAI’s vocabulary fits. The categories overlap, though they overlap less than the marketing of any single framework wants to suggest. Picking Haystack for the agent-with-tools-and-retrieval case in 2026 is not a hedge against the hype cycle. It is the answer that was right before the hype cycle and stayed right through it.
The failure modes are worth naming honestly. The pipeline graph model is rigid in exactly the way it is supposed to be rigid, which means problems that genuinely require dynamic graph mutation at runtime do not fit cleanly. Agent loops inside Haystack work well when the tool surface and the control flow are bounded, and they get awkward when the agent needs to compose new sub-pipelines on the fly. The component contract assumes mostly Python, which means polyglot teams that have settled on TypeScript for the agent layer will find Mastra or the OpenAI Agents SDK a more natural fit. None of these are reasons to avoid Haystack. They are reasons to know which slot it fills, which is the production-RAG-and-bounded-agents slot, not the freeform-orchestration slot.
The reason Haystack belongs in this month’s framework rundown, and the reason it gets the kickoff slot for the “older stack holds up” arc, is the same reason teams ignore it when they shouldn’t. The framework has a name that sounds like 2022. The 2.x architecture is a 2026-grade composition model with typed graphs, async execution, agent support, and a serving story. The gap between the perception and the reality is the kind of gap that careers get made in, because the engineer who has shipped two services on Haystack 2.x in the last year is not waiting for the next agent framework to stabilize. They are already shipping the third. “Boring and ships” is not a slogan. It is the most honest description of a tool that you will ever read in a blog post about it.
If this was useful, forward it to one engineer who needs less noise in their feed.


