
A look into our refactored scheduler and executor
Welcome to the Engineering Den, a new series where our engineers give you a quick peek under the hood at how they’re upgrading the Redpanda Streaming engine and Agentic Data Plane. (And yes, it’s called Engineering Den because pandas live in dens.)
First, some context. Back in October, Redpanda launched the Agentic Data Plane: a unified access layer that securely connects AI agents to enterprise data and systems. To help enable that governed data access, Redpanda acquired Oxla, a distributed SQL engine and database that provides a standard SQL endpoint for AI agents to query data in motion or at rest.
While Oxla is now part of Redpanda, the Oxla team never stopped working on the engine. Lately, we've been working on a new query manager, the component responsible for the lifecycle of running queries. The work was motivated by some very real pain with the old query manager, and we soon realized that we needed an approach better suited to the scale we'll face in the near future. Our primary objective: improving cluster stability when scheduling, canceling, or restarting queries.
So instead of trying to patch around it again, we refactored the whole thing.
The main issue was state management. Queries could get stuck in “finished” or “executing” while still holding onto resources. Different parts of the system disagreed about what was actually happening. A query might show as scheduled in one place and finished in another. From the outside, it looked like things were running. Under the hood, it was a mess.
Cancellation was especially rough. To avoid deadlocks, the old code gathered running queries, spawned async work per thread, and sometimes had to retry cancellation from a different thread entirely. That approach had already caused problems in the past, and it made the system hard to reason about and harder to debug.
Here's what we did and what changed.
The new scheduler is built as a deterministic state machine. At any point, it’s in a known state, handling a specific event, and transitioning predictably. Every transition is logged. That means when something goes wrong, you can look at the logs and see exactly where the scheduler was and what it was doing. There’s no ambiguity about whether a query is running, scheduled, canceled, or done. The system always knows, and you can see it.
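To make the idea concrete, here is a minimal sketch of a deterministic, logged state machine for a single query. It is an illustration only, not the engine's actual code: the state and event names, the `TRANSITIONS` table, and the `QueryStateMachine` class are all assumptions, loosely based on the states mentioned in this post.

```python
import logging
from enum import Enum, auto

logger = logging.getLogger("scheduler_sketch")


class State(Enum):
    SCHEDULED = auto()
    EXECUTING = auto()
    FINISHED = auto()
    CANCELED = auto()


class Event(Enum):
    START = auto()
    COMPLETE = auto()
    CANCEL = auto()


# Hypothetical transition table. Each legal (state, event) pair maps to
# exactly one next state, which is what makes the machine deterministic.
TRANSITIONS = {
    (State.SCHEDULED, Event.START): State.EXECUTING,
    (State.SCHEDULED, Event.CANCEL): State.CANCELED,
    (State.EXECUTING, Event.COMPLETE): State.FINISHED,
    (State.EXECUTING, Event.CANCEL): State.CANCELED,
}


class QueryStateMachine:
    """Tracks one query's lifecycle and logs every transition."""

    def __init__(self, query_id):
        self.query_id = query_id
        self.state = State.SCHEDULED
        self.history = [State.SCHEDULED]

    def handle(self, event):
        key = (self.state, event)
        if key not in TRANSITIONS:
            # Terminal states have no outgoing transitions, so a finished
            # or canceled query can never silently change state again.
            raise ValueError(
                f"query {self.query_id}: illegal event {event.name} "
                f"in state {self.state.name}"
            )
        next_state = TRANSITIONS[key]
        logger.info(
            "query %s: %s -> %s on %s",
            self.query_id, self.state.name, next_state.name, event.name,
        )
        self.state = next_state
        self.history.append(next_state)
        return next_state
```

Because illegal events raise instead of being ignored, a disagreement about a query's state shows up at the moment it happens rather than later as a stuck query.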
Cleanup is also explicit now. When a query finishes, the scheduler and executors are torn down cleanly. Finished queries stay finished. Canceled queries are accounted for. Nothing hangs around quietly consuming resources anymore.
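One way to picture explicit cleanup is a single teardown path that every terminal transition goes through. The sketch below is a simplified assumption, not the real implementation: `Executor`, `Scheduler`, and their methods are hypothetical names, and "resources" are reduced to a boolean flag.

```python
class Executor:
    """Stand-in for a per-query executor that holds resources."""

    def __init__(self, name):
        self.name = name
        self.released = False

    def release(self):
        self.released = True


class Scheduler:
    TERMINAL = {"finished", "canceled"}

    def __init__(self):
        self.states = {}     # query_id -> state name
        self.executors = {}  # query_id -> list of live executors

    def start(self, query_id, executor_count):
        self.states[query_id] = "executing"
        self.executors[query_id] = [
            Executor(f"{query_id}-{i}") for i in range(executor_count)
        ]
        return list(self.executors[query_id])

    def finish(self, query_id):
        self._terminate(query_id, "finished")

    def cancel(self, query_id):
        self._terminate(query_id, "canceled")

    def _terminate(self, query_id, final_state):
        if query_id not in self.states:
            raise KeyError(f"unknown query {query_id}")
        if self.states[query_id] in self.TERMINAL:
            return  # terminal states are final: finished stays finished
        # Release every executor before recording the terminal state, so a
        # query is never marked done while still holding resources.
        for executor in self.executors.pop(query_id, []):
            executor.release()
        self.states[query_id] = final_state
```

Routing both `finish` and `cancel` through the same `_terminate` path means there is only one place where resources can leak, and a late cancel on an already finished query is a no-op.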
One of the biggest improvements showed up during development itself. Bugs still happened, as they always do with new code, but they were much easier to track down. Being able to trace state transitions made fixes straightforward instead of exploratory.
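As an illustration of why logged transitions help, the sketch below replays a transition log and reports the first line that breaks the rules. The log format (`query <id>: OLD -> NEW`), the `LEGAL` set, and the `first_bad_transition` helper are all hypothetical; the real scheduler's log lines and states may look different.

```python
import re

# Hypothetical set of legal (old_state, new_state) pairs.
LEGAL = {
    ("SCHEDULED", "EXECUTING"),
    ("SCHEDULED", "CANCELED"),
    ("EXECUTING", "FINISHED"),
    ("EXECUTING", "CANCELED"),
}

# Assumed log format: "query 7: SCHEDULED -> EXECUTING"
LINE = re.compile(r"query (\d+): (\w+) -> (\w+)")


def first_bad_transition(log_lines):
    """Return (line_number, line) for the first inconsistent transition,
    or None if the whole log is consistent."""
    current = {}  # query_id -> last known state
    for number, line in enumerate(log_lines, start=1):
        match = LINE.search(line)
        if match is None:
            continue  # not a transition line
        query_id, old, new = match.groups()
        # Every query is assumed to start in SCHEDULED.
        expected = current.get(query_id, "SCHEDULED")
        if old != expected or (old, new) not in LEGAL:
            return number, line
        current[query_id] = new
    return None
```

Given a log where a finished query starts executing again, the function points at that exact line, which is the kind of direct answer that replaces exploratory debugging.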
We’ve run around 25,000 queries on one- and three-node clusters with the new implementation and haven’t seen the kinds of issues that were common before. No stuck queries, no confusing state, no guessing what the system thinks is happening.
In practical terms, the new implementation is already more reliable than the old one. And with this new scheduler, we’re much better prepared for scale, and we can build confidence on clusters with large node counts. So, the TL;DR of this post is:
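Problems with the old query manager: queries got stuck in “finished” or “executing” while still holding resources, different parts of the system disagreed about a query's state, and cancellation relied on per-thread async work and cross-thread retries.
Benefits of the new scheduler/executor: a deterministic state machine with every transition logged, explicit teardown when a query finishes or is canceled, and bugs that are much easier to trace.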
We expect to have this in production within days. We also have a few more improvements in flight, so make sure to subscribe to the blog or join the Redpanda Community Slack to chat with us directly.