Introducing Iceberg output for Redpanda Connect

From any source to any schema — lakehouse ingestion made simple (and boring)

March 5, 2026

Apache Iceberg™ has become the table format that teams reach for when they want their streaming data to be queryable in the lakehouse. But for many, the "last mile" of this journey is where architectural complexity and hidden costs begin to pile up. Getting data there often means standing up heavyweight infrastructure and stitching together services just to move bytes from A to B. 

We wanted to make that a lot simpler.

Today we're announcing the Iceberg output for Redpanda Connect: a native component that writes streaming data to Iceberg tables directly from a declarative YAML pipeline. That means you can transform, enrich, and route streams before they land in your data lake, and query tables seconds later from any platform that supports Iceberg. See? Simple.

From stream to table, without the detours

If you're already running Redpanda, you might already be familiar with Iceberg Topics. They give you a zero-ETL path from broker to table that's streamlined for high-speed Kafka streams. Produce to a topic, and Redpanda handles the rest. For many workloads, that's all you need.

But maybe your data arrives from an HTTP webhook, a Postgres CDC stream, or a GCP Pub/Sub subscription. Maybe you need to normalize a payload, drop PII, or split a mixed event stream by type before anything hits the lakehouse. That's the gap this connector fills.

The Iceberg output plugs into Redpanda Connect's full ecosystem of 300+ inputs and processors. That means any source you can read from, you can now land directly into Iceberg tables with whatever transformations you need along the way.
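As a sketch of what that looks like, swapping the source is a one-block change. For example, to accept events over an HTTP webhook instead of a topic (the endpoint path below is illustrative):

```yaml
input:
  http_server:
    path: /ingest   # illustrative path; POST JSON events to this endpoint
```

Everything downstream, the processors and the Iceberg output, stays the same.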

Example pipeline

Here's a pipeline that reads events from a Redpanda topic, enriches each message with an ingestion timestamp, and routes them into per-type Iceberg tables:

input:
  redpanda:
    seed_brokers: ["${REDPANDA_BROKERS}"]
    topics: ["events"]
    consumer_group: "iceberg-sink"

pipeline:
  processors:
    - mapping: |
        root = this
        root.ingested_at = now()

output:
  iceberg:
    catalog:
      url: https://polaris.example.com/api/catalog
      warehouse: analytics
      auth:
        oauth2:
          client_id: "${CATALOG_CLIENT_ID}"
          client_secret: "${CATALOG_CLIENT_SECRET}"
    namespace: raw.events
    table: 'events_${!this.event_type}'
    storage:
      aws_s3:
        bucket: my-iceberg-data
        region: us-west-2

The value for seed_brokers uses a Redpanda Cloud contextual variable that's available out of the box in your environment. The field is optional for the redpanda input, but it's included above for clarity.

The table and namespace fields both support Bloblang interpolation, so a single pipeline can route messages to different tables based on content. Traditional Iceberg connectors often lead to "configuration hell," where every new table requires rigid mapping and brittle, manual updates. Suffer no more with Redpanda Connect!

Reshape your data before it lands

With Bloblang, you can:  

  • Reshape, filter, and enrich messages inline 
  • Flatten nested JSON into a columnar-friendly schema 
  • Strip sensitive fields before they reach the lakehouse
  • Derive new columns from existing ones 

It's all just a mapping processor in your pipeline block, running before the Iceberg output ever sees the message.
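For example, a single mapping can flatten, mask, and derive in one pass. The field names here are hypothetical, just to sketch the shape of such a mapping:

```yaml
pipeline:
  processors:
    - mapping: |
        # flatten a nested user object into top-level, columnar-friendly fields
        root.user_id = this.user.id
        root.user_country = this.user.address.country
        # strip a sensitive field before it reaches the lakehouse
        root.email = deleted()
        # derive a new column from an existing one
        root.is_mobile = this.user_agent.contains("Mobile")
```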

Your analysts get clean, query-ready tables. Your engineers get a single pipeline definition to maintain. No sidecar services, no separate Flink job.

Works with your catalog

The connector speaks the Iceberg REST Catalog API, so it works with the catalogs you're probably already running:

  • Apache Polaris™
  • AWS Glue Data Catalog
  • Databricks Unity Catalog
  • Snowflake Open Catalog
  • GCP BigLake

If your catalog speaks REST, you can point the connector at it.

Small in size, big on benefits

Redpanda Connect's new Iceberg output helps teams move quickly and efficiently, so you can spend more time building and analyzing instead of moving and preparing data.

Less schema maintenance

While other connectors can technically evolve a schema, doing so without a schema registry usually forces you into "maintenance toil" (chaining brittle Kafka Connect SMTs) or leaves you with "dirty data" (where all columns land as string data types). Redpanda Connect gives you the best of both worlds: the flexibility of raw JSON with the precision of a structured lakehouse.

We handle cleaning, masking, and landing in a single pipeline. The Iceberg output also supports schema evolution: it detects new fields in an incoming JSON stream and automatically updates the Iceberg table metadata. No manual DDL, no registry required, and no ticket for the ops team every time an app update adds a column.
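As an illustrative sketch (the field names are hypothetical, and the exact inferred column types depend on the connector's mapping rules), a stream that starts carrying a new field simply grows the table:

```yaml
# Day 1: payloads land with columns event_type and user_id
{"event_type": "signup", "user_id": 42}
# Day 2: a "plan" field appears, and the table gains a matching column automatically
{"event_type": "signup", "user_id": 43, "plan": "pro"}
```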

Efficient at scale

Stop paying for quiet data sources and achieve greater resource density. Unlike legacy connectors that heartbeat on a fixed timer regardless of activity, Redpanda Connect uses data-driven flushing. It only executes a flush operation when there is actual data to move, preventing the "small file problem" on object storage and ensuring you aren't wasting compute cycles on empty operations.
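As a sketch of how flushing might be tuned, assuming the iceberg output exposes Redpanda Connect's standard batching block (see the configuration reference for the authoritative fields):

```yaml
output:
  iceberg:
    # ...catalog, namespace, table, and storage as shown earlier...
    batching:
      count: 5000   # flush once 5,000 messages are buffered...
      period: 30s   # ...or after 30 seconds, whichever comes first
```

With data-driven flushing, an idle source buffers nothing and triggers no empty commits, so small-file buildup on object storage is avoided.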

Enterprise-grade governance

We speak security and isolation. Redpanda Connect fits into your existing OAuth2 token exchange and per-tenant REST catalog (like Polaris) workflows out of the box. And because Redpanda Connect is so lightweight (runs as low as 0.1 vCPU), you can deploy isolated, high-density pipelines for every tenant or department without blowing your cloud budget.

When to use Iceberg output

Here's an overview of when to use the in-broker Redpanda Iceberg Topics integration versus the Iceberg output in Redpanda Connect.

Primary value
  • Iceberg Topics (in-broker): Zero-ETL performance. Lowest latency from stream to table.
  • Iceberg output (sink connector): Integration flexibility. Route, transform, and automate in-stream before landing to tables.

Best for
  • Iceberg Topics: High-throughput, standard Kafka-to-lakehouse workloads.
  • Iceberg output: Complex pipelines, non-Kafka sources, and "set-and-forget" schemas.

Data sources
  • Iceberg Topics: Redpanda streaming topics only.
  • Iceberg output: Hundreds of sources (HTTP, CDC, SQS, Kinesis, etc.).

Schema evolution
  • Iceberg Topics: Registry-driven. Evolves automatically as you update Avro/Protobuf/JSON schemas in the Schema Registry.
  • Iceberg output: Data-driven. Table structure can evolve automatically from raw JSON, no registry required.

Routing
  • Iceberg Topics: Optimized for 1 topic → 1 table.
  • Iceberg output: Multi-table. Route to many tables from one stream.

Infrastructure
  • Iceberg Topics: Zero extra components.
  • Iceberg output: Stateless container (stateless pipeline) on Kubernetes.

Availability
  • Iceberg Topics: Redpanda Cloud BYOC or Self-Managed Enterprise Edition.
  • Iceberg output: Enterprise-tier connector for Redpanda Connect (requires a license).

Getting started

The Iceberg output ships with Redpanda Connect v4.80.0. This initial release focuses on high-speed append-only ingestion (with upserts on the roadmap). 

Pull the latest from our Docs and write your first pipeline, then query your tables from the analytics engine of your choice.

Check out the full configuration reference for every field and option, including partition spec expressions, commit tuning, and batching configuration. Build your first pipeline today and start landing data on your own terms: stream, table, or all of the above!
