
From any source to any schema — lakehouse ingestion made simple (and boring)
Apache Iceberg™ has become the table format that teams reach for when they want their streaming data to be queryable in the lakehouse. But for many, the "last mile" of this journey is where architectural complexity and hidden costs begin to pile up. Getting data there often means standing up heavyweight infrastructure and stitching together services just to move bytes from A to B.
We wanted to make that a lot simpler.
Today we're announcing the Iceberg output for Redpanda Connect: a native component that writes streaming data to Iceberg tables directly from a declarative YAML pipeline. That means you can transform, enrich, and route streams before they land in your data lake, and query tables seconds later from any platform that supports Iceberg. See? Simple.
If you're already running Redpanda, you might already be familiar with Iceberg Topics. They give you a zero-ETL path from broker to table that's streamlined for high-speed Kafka streams. Produce to a topic, and Redpanda handles the rest. For many workloads, that's all you need.
But maybe your data arrives from an HTTP webhook, a Postgres CDC stream, or a GCP Pub/Sub subscription. Maybe you need to normalize a payload, drop PII, or split a mixed event stream by type before anything hits the lakehouse. That's the gap this connector fills.
The Iceberg output plugs into Redpanda Connect's full ecosystem of 300+ inputs and processors. That means any source you can read from, you can now land directly into Iceberg tables with whatever transformations you need along the way.
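Swapping the source is a one-stanza change. As a sketch, here's the same pipeline fed from a Google Cloud Pub/Sub subscription instead of a topic (the gcp_pubsub input is part of the standard Redpanda Connect catalog; the project and subscription names below are hypothetical):

```yaml
input:
  gcp_pubsub:
    project: my-gcp-project     # hypothetical GCP project ID
    subscription: events-sub    # hypothetical subscription name
```

Everything downstream — the processors and the Iceberg output — stays exactly the same.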
Here's a pipeline that reads events from a Redpanda topic, enriches each message with an ingestion timestamp, and routes them into per-type Iceberg tables:
input:
  redpanda:
    seed_brokers: ["${REDPANDA_BROKERS}"]
    topics: ["events"]
    consumer_group: "iceberg-sink"

pipeline:
  processors:
    - mapping: |
        root = this
        root.ingested_at = now()

output:
  iceberg:
    catalog:
      url: https://polaris.example.com/api/catalog
      warehouse: analytics
      auth:
        oauth2:
          client_id: "${CATALOG_CLIENT_ID}"
          client_secret: "${CATALOG_CLIENT_SECRET}"
    namespace: raw.events
    table: 'events_${!this.event_type}'
    storage:
      aws_s3:
        bucket: my-iceberg-data
        region: us-west-2

The value for seed_brokers uses Redpanda Cloud contextual variables that are available out of the box in your environment. It's optional for the redpanda input, but it's included above for clarity.
The table and namespace fields both support Bloblang interpolation, so a single pipeline can route messages to different tables based on content. Traditional Iceberg connectors often lead to "configuration hell," where every new table requires rigid mapping and brittle, manual updates. Suffer no more with Redpanda Connect!
With Bloblang, you can normalize payloads, mask or drop sensitive fields, and route events to different tables based on their content. It's all just a mapping processor in your pipeline block, running before the Iceberg output ever sees the message.
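As an illustration, here's a sketch of a mapping that normalizes a payload and strips a PII field before anything reaches the output (the field names are hypothetical):

```yaml
pipeline:
  processors:
    - mapping: |
        root = this
        root.email = deleted()                          # drop PII before it lands in the lake
        root.event_type = this.event_type.lowercase()   # normalize for consistent routing

output:
  iceberg:
    # ...catalog and storage config as in the pipeline above...
    table: 'events_${!this.event_type}'                 # route by content via interpolation
```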
Your analysts get clean, query-ready tables. Your engineers get a single pipeline definition to maintain. No sidecar services, no separate Flink job.
The connector speaks the Iceberg REST Catalog API, so it works with the REST-compatible catalogs you're probably already running, such as Apache Polaris.
[Figure: How Redpanda Connect's new Iceberg output helps teams move quickly and efficiently, so you can spend more time building and analyzing instead of moving and preparing data.]
While other connectors can technically evolve a schema, doing so without a schema registry usually forces you into "maintenance toil" (chaining brittle Kafka Connect SMTs) or leaves you with "dirty data" (where all columns land as string data types). Redpanda Connect gives you the best of both worlds: the flexibility of raw JSON with the precision of a structured lakehouse.
We handle cleaning, masking, and landing in a single pipeline. The Iceberg output also supports schema evolution: it detects new fields in an incoming JSON stream and automatically updates the Iceberg table metadata. No manual DDL, no registry required, and no ticket for the ops team every time an app update adds a column.
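As a concrete sketch (the payloads here are hypothetical), consider two events arriving on the same topic:

```json
{"user_id": "u_123", "event_type": "login"}
{"user_id": "u_456", "event_type": "upgrade", "plan": "enterprise"}
```

The first event lands in a table with user_id and event_type columns; when the second arrives, the connector evolves the table to include a plan column automatically — no DDL and no schema registry involved.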
Stop paying for quiet data sources and achieve greater resource density. Unlike legacy connectors that heartbeat on a fixed timer regardless of activity, Redpanda Connect uses data-driven flushing. It only executes a flush operation when there is actual data to move, preventing the "small file problem" on object storage and ensuring you aren't wasting compute cycles on empty operations.
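Flush behavior is tunable through standard output batching. A sketch under assumed values (check the configuration reference for the exact fields and defaults):

```yaml
output:
  iceberg:
    # ...catalog and storage config...
    batching:
      count: 5000   # flush after 5,000 messages...
      period: 30s   # ...or after 30 seconds, whichever comes first
```

Because flushing is data-driven, an empty batch produces no commit — and no small files on object storage.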
We speak security and isolation. Redpanda Connect fits into your existing OAuth2 token exchange and per-tenant REST catalog (like Polaris) workflows out of the box. And because Redpanda Connect is so lightweight (runs as low as 0.1 vCPU), you can deploy isolated, high-density pipelines for every tenant or department without blowing your cloud budget.
[Table: An overview of when to use the in-broker Redpanda Iceberg Topics integration versus the Iceberg output in Redpanda Connect.]
The Iceberg output ships with Redpanda Connect v4.80.0. This initial release focuses on high-speed append-only ingestion (with upserts on the roadmap).
Pull the latest from our Docs and write your first pipeline, then query your tables from the analytics engine of your choice.
Check out the full configuration reference for every field and option, including partition spec expressions, commit tuning, and batching configuration. Build your first pipeline today and start landing data on your own terms: stream, table, or all of the above!
