Soft pastel gradient blending from warm peach to cool blue

Redpanda SQL

Query everything. Including the last millisecond.

Analyze data in both Redpanda topics and Apache Iceberg tables in a simple SQL statement. No data movement. No pipelines. No waiting.

Request a demoRead the docs

Your analytics tools are constantly playing catch-up.

The pipeline always has a price.

Before your analysts can query streaming data, someone has to build the ETL pipeline, configure the Kafka Connect sink, deal with schema mapping, and own it when it breaks. That's the pipeline price: weeks of engineering overhead before anyone touches a dataset.

Two systems. Zero unified view.

Keeping streaming and lakehouse data aligned means checkpointing, gap detection after job failures, duplicate handling, and ongoing job monitoring. Two systems don't just complicate queries. They create operational toil and a tax to pay before analysis can start.

Nobody likes the debugging workaround.

Debugging a streaming app or chasing a data quality issue means either writing code, or exporting topic data to a file and parsing through it by hand. Every engineer who's done it has thought the same thing: this could have been a SQL query.

One query. Live real-time topics + Iceberg history. No mess.

Redpanda SQL is a Postgres-compatible analytical query engine that reaches across Redpanda streams and your existing lakehouse, without moving a single byte outside your network.

SQL your team already knows.

Redpanda SQL speaks the Postgres SQL dialect and wire protocol. Almost any standard SQL client connects without modification: psql, DBeaver, DataGrip. No unfamiliar query syntax or semantics.

No pipeline. No lag.

Redpanda SQL queries data in place. No ETL. No connector to babysit. No manual deduplication. Results from data that arrived milliseconds ago seamlessly fused with last year’s data.

Stays inside your network.

Redpanda SQL runs inside your Bring Your Own Cloud (BYOC) cluster, on the same infrastructure as your Redpanda Streaming brokers. Every query executes in place. Nothing leaves your perimeter.

Redpanda SQL queries data in place. No ETL. No connector to babysit. Results from data that arrived milliseconds ago.

Redpanda SQL does four jobs your warehouse can't do.

Operational analytics

Query what's happening now alongside the full historical record. Immediate data freshness. One SQL statement. No pipelines. No data movement.

End-to-End governance & compliance

Regulated data that can't leave your environment can now be queried inside your VPC with BYOC deployment. No separate SaaS provider. No data egress.

Topic debugging

Malformed events, missing fields, poison pills — skip the scripts. Query your live topics with SQL and diagnose the culprit in your code in minutes, not hours.

AI agent workloads

Agents fan out across data sources, issuing hundreds of SQL queries in parallel. Feed them live data, not an old snapshot. Their answers match reality.

The Redpanda Data Platform

Stream, connect and query. One vendor.

Three products. One architecture. One operational model. One vendor.

Stream

GBs/sec throughput, built in C++. The substrate for every Redpanda service.

Connect

Hundreds of connectors. Filter, enrich, route, and transform messages between any two systems.

Query

Ad-hoc SQL across live topics and Iceberg history. No pipeline required.

The pipeline approach vs. Redpanda SQL

Redpanda SQL is not a replacement for your existing analytics stack. Snowflake stays. Databricks stays. Redpanda SQL covers what they can't reach.

Red panda wearing a shirt typing SQL query on vintage computer in dimly lit room at night.

What You Need	The Pipeline Approach	Redpanda SQL
Data freshness	1–10 min lag from pipeline ingest	Milliseconds fresh, query data as it arrives
Coverage	Cold tier only (what made it to the warehouse/lakehouse)	Hot + cold unified, live topics and full Iceberg history
Pipeline burden	Build, deploy, monitor, maintain a connector or ETL/streaming job	No pipeline required, query directly
Analyst access	Requires pipeline to be built first	Almost any Postgres client, accessible with familiar PostgreSQL
AI agent query model	Batch-oriented / stale data, or bulky, per-query streaming jobs	Hundreds of parallel SQL queries against live data
Data governance	Data must egress to an external SaaS provider	Query in place, within your VPC

FAQ: Redpanda SQL

What is Redpanda SQL?

Redpanda SQL is a Postgres-compatible analytical query engine that runs inside your Redpanda BYOC cluster. It gives you direct SQL access to live data in your Apache Kafka-compatible Redpanda Streaming topics and historical data in Iceberg tables, combined in a single table/query, without pipelines or new infrastructure to provision. It performs a single federated query across both and combines the results seamlessly and accurately. Connect using almost any standard Postgres client and start querying immediately.

How is this different from querying data in my warehouse or lakehouse?

Warehouses can only query data that's already been explicitly ingested, which requires a data engineer and typically adds 1–10 minutes of pipeline lag (or longer, for some infrequent batching jobs). Redpanda SQL automatically and transparently queries live streaming data and historical Iceberg data in a single, simple statement. No data engineering, complicated joins or union logic. It covers what your warehouse can't reach.

Do I need to set up pipelines or materialized views before I can query?

No. Redpanda SQL is fully ad-hoc: connect your Postgres client and query any topic, any time, with no pipelines or pre-defined views. This separates Redpanda SQL from tools like ksqlDB, which creates separate materialized views and requires additional infrastructure.

What SQL clients work with Redpanda SQL?

Redpanda SQL uses the Postgres wire protocol, so almost any standard Postgres-compatible SQL client connects without modification: psql, DBeaver, DataGrip, and most analytics tooling that supports the Postgres interface. No proprietary drivers, no new SDKs, no obscure query syntax or semantics (like with FlinkSQL), and no special configuration required.

Is Redpanda SQL a replacement for Snowflake or Databricks?

No, Redpanda SQL is additive. Snowflake and Databricks handle the workloads they're designed for. Redpanda SQL covers what they can't: streaming data still in your topics before lakehouse ingestion, queried inside your own VPC without egress charges or compliance issues.

What are Iceberg Topics and do I need them for every use case?

Iceberg Topics is a Redpanda Streaming feature that writes streaming data to both a hot tier (the live Redpanda topics on local storage) and a cold tier (Parquet/Iceberg in S3 or GCS). At this time, Redpanda SQL queries Iceberg tables written by Redpanda Iceberg Topics specifically — not external Iceberg tables populated by other tools. Standard Redpanda topics work fine without Iceberg for SQL use cases like topic debugging and real-time analytics.

What technology powers Redpanda SQL?

Redpanda SQL is not built on PostgreSQL. It's built on the MPP (Massively Parallel Processing) engine technology that Redpanda acquired from Oxla, written from the ground up in C++ including many novel query engine innovations. The engine speaks the Postgres wire protocol natively, which is why standard clients like psql and DBeaver connect without modification.

It's designed for large-scale analytical workloads with strong data compression and cost-efficient scaling, and is managed entirely by Redpanda — no separate Oxla relationship or additional configuration required. The Oxla engine's novel ring-buffer streaming shuffle allows it to maximize CPU usage, especially at large core counts. Its design is described in a peer-reviewed paper, conditionally accepted for publication at VLDB 2026, one of the premier academic conferences for database systems research.

Can I use Postgres extensions with Redpanda SQL?

No. Since Redpanda SQL is only Postrgres wire-compatible, but does not otherwise share the same Postgres code base, you cannot employ extensions with it, such as PostGIS or pgvector.

Is Redpanda SQL available for self-managed, Azure, or GCP deployments?

Redpanda SQL is generally available today on Redpanda BYOC on AWS. GCP support is coming soon. Self-managed / on-premises deployment is targeted for 2H FY2027. Reach out to your Redpanda account team for current availability or to join the waitlist for upcoming deployment options, including Azure.

Get Started

Your data is already there. Now you can query it.

Activate SQL on your existing cluster: no new infrastructure, no new vendor relationship, usage-based billing. Redpanda SQL is available now on Redpanda BYOC.

Start with Redpanda BYOC Read the docs