Pandaproxy: Bringing Kafka to the masses

Pandaproxy allows access to Redpanda via a REST API.

April 27, 2021
TL;DR Takeaways:
How can I consume messages in Redpanda?

To retrieve messages in Redpanda, send a GET to the /consumers/{consumer_group}/instances/{consumer}/records endpoint. To consume JSON encoded messages, you need to specify the Accept header: Accept: application/vnd.kafka.json.v2+json.

How can I produce to a topic in Redpanda?

To produce to a topic in Redpanda, you need to POST a list of records to the /topics/{topic} endpoint. JSON and base64 encoded payloads are currently supported, specified with a Content-Type of application/vnd.kafka.json.v2+json or application/vnd.kafka.binary.v2+json, respectively.

How can I start using Redpanda?

You can start using Redpanda by installing it and ensuring it's up to date. Follow the instructions in the Linux, macOS, Kubernetes, or Docker quick start guides. Alternatively, you can sign up for Redpanda Cloud to have the infrastructure managed for you.

How do I create a consumer in Redpanda?

In Redpanda, consumers belong to a consumer group. To create a consumer, you need to POST the consumer configuration to the /consumers/{consumer_group} endpoint. The standard protocol for interacting with the REST API is specified with: Content-Type: application/vnd.kafka.v2+json.

What is Pandaproxy in Redpanda?

Pandaproxy is a new subsystem of Redpanda that allows access to your data through a REST API. It enables you to use your favorite HTTP CLI or library to produce and consume a stream of events.

What is Redpanda?

Redpanda is a high-performance, Kafka API-compatible streaming data platform designed for modern applications requiring low latency and high throughput. Built from the ground up in C++, Redpanda eliminates the complexity of traditional streaming platforms by providing a single binary with no external dependencies like JVM or Zookeeper. It offers complete Kafka API compatibility, allowing existing Kafka applications to run without modifications while delivering up to 10x better performance. Redpanda features a thread-per-core architecture that ensures predictable, consistent performance without garbage collection pauses. The platform includes built-in essentials like Schema Registry, HTTP Proxy, and native Tiered Storage for cost-effective data retention. With automatic partition balancing, WebAssembly support for inline transforms, and end-to-end encryption, Redpanda simplifies operations while maintaining enterprise-grade reliability. It's designed for teams who want Kafka's capabilities without its operational complexity, offering reduced infrastructure costs, simplified deployment, and consistent sub-millisecond latencies even under heavy load.


At Redpanda, we like to make things simple. Redpanda is an Apache Kafka®-compatible event streaming platform that eliminates ZooKeeper™ and the JVM, autotunes itself for modern hardware, and ships in a single binary.

There are many high-quality Kafka clients for lots of languages, but wouldn't it be nice if you could just fire up your favorite HTTP CLI or library to produce and consume a stream of events?

Well, we're pleased to announce Pandaproxy, a new subsystem of Redpanda that allows access to your data through a REST API!

It's already available in Redpanda, so if you have Redpanda installed, make sure it's up to date. If not, follow the instructions in the Linux, macOS, Kubernetes, or Docker quick start guides.

If you want to leave the infrastructure issues to us, sign up for Redpanda Cloud for the simplest way to run Redpanda.

Example: produce and consume

Let's jump right in and start Redpanda using Docker on Linux:

docker network create redpanda
docker volume create redpanda
docker run \
  --pull=always \
  --name=redpanda \
  --net=redpanda \
  -v "redpanda:/var/lib/redpanda/data" \
  -p 8082:8082 \
  -p 9092:9092 \
  --detach \
  vectorized/redpanda start \
  --overprovisioned \
  --smp 1 \
  --memory 1G \
  --reserve-memory 0M \
  --node-id 0 \
  --check=false \
  --pandaproxy-addr 0.0.0.0:8082 \
  --advertise-pandaproxy-addr 127.0.0.1:8082 \
  --kafka-addr 0.0.0.0:9092 \
  --advertise-kafka-addr redpanda:9092

Create the topic my_topic:

docker run \
  --net=redpanda \
  vectorized/redpanda \
  --brokers=redpanda:9092 \
  topic create my_topic \
  --partitions=3 \
  --replicas=1

Now we're ready to start using Pandaproxy!

Endpoints are documented with Swagger at http://localhost:8082/v1.

For the curl examples, I'm using jq to prettify and process the JSON responses.

For the Python examples, we'll use the popular requests module (pip install requests).

For the rest of the guide, we'll assume the following for an interactive python session:

import requests
import json
def pretty(text):
  print(json.dumps(text, indent=2))

base_uri = "http://localhost:8082"

List topics

  • Curl
  • Python
res = requests.get(f"{base_uri}/topics").json()
pretty(res)
curl -s "localhost:8082/topics" | jq .

[
  "my_topic"
]

Produce to a topic

We need to POST a list of records to the /topics/{topic} endpoint.

JSON and base64 encoded payloads are currently supported, specified with a Content-Type of application/vnd.kafka.json.v2+json or application/vnd.kafka.binary.v2+json, respectively. We'll use JSON:

  • Curl
  • Python

curl -s \
 -X POST \
 "http://localhost:8082/topics/my_topic" \
 -H "Content-Type: application/vnd.kafka.json.v2+json" \
 -d '{
 "records":[
     {
         "value":"Vectorized",
         "partition":0
     },
     {
         "value":"Pandaproxy",
         "partition":1
     },
     {
         "value":"JSON Demo",
         "partition":2
     }
 ]
}' | jq .

{
 "offsets": [
   {
     "partition": 0,
     "offset": 0
   },
   {
     "partition": 1,
     "offset": 0
   },
   {
     "partition": 2,
     "offset": 0
   }
 ]
}
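The same produce request in Python, a minimal sketch that mirrors the curl call above using the requests session set up earlier:

records = {
    "records": [
        {"value": "Vectorized", "partition": 0},
        {"value": "Pandaproxy", "partition": 1},
        {"value": "JSON Demo", "partition": 2},
    ]
}
# Send the vendor-specific Content-Type explicitly; requests' json= kwarg
# would set application/json instead, which isn't what we want here.
res = requests.post(
    f"{base_uri}/topics/my_topic",
    data=json.dumps(records),
    headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
).json()
pretty(res)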

If a partition is not specified, one is chosen based on a murmur2 hash of the key. If there is no key, partitions are chosen using a round-robin strategy.

Create a consumer

Consumers belong to a consumer group. If you have many consumers in a group, messages are distributed between all consumers.

The standard protocol for interacting with the REST API is specified with: Content-Type: application/vnd.kafka.v2+json.

We need to POST the consumer configuration to the /consumers/{consumer_group} endpoint. Let's call the consumer my_consumer and the consumer group my_group, with format = json:

  • Curl
  • Python

curl -s \
 -X POST \
 "http://localhost:8082/consumers/my_group"\
 -H "Content-Type: application/vnd.kafka.v2+json" \
 -d '{
 "format":"json",
 "name":"my_consumer",
 "auto.offset.reset":"earliest",
 "auto.commit.enable":"false",
 "fetch.min.bytes": "1",
 "consumer.request.timeout.ms": "10000"
}' | jq .

{
 "instance_id": "my_consumer",
 "base_uri": "http://127.0.0.1:8082/consumers/my_group/instances/my_consumer"
}

We'll need the base_uri for further interaction with the consumer: it identifies both the consumer and the particular Pandaproxy instance to talk to when we're connecting to a Redpanda cluster.
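Here's the same call as a Python sketch. We capture the returned base_uri for the calls that follow (consumer_uri is just a variable name chosen for this guide):

consumer_config = {
    "format": "json",
    "name": "my_consumer",
    "auto.offset.reset": "earliest",
    "auto.commit.enable": "false",
    "fetch.min.bytes": "1",
    "consumer.request.timeout.ms": "10000",
}
res = requests.post(
    f"{base_uri}/consumers/my_group",
    data=json.dumps(consumer_config),
    headers={"Content-Type": "application/vnd.kafka.v2+json"},
).json()
pretty(res)
# Keep the returned base_uri for later calls; consumer_uri is an
# illustrative name, not part of the API.
consumer_uri = res["base_uri"]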

Subscribe the consumer

Consumer groups listen to topics. To subscribe a consumer group to a list of topics, subscribe any of the consumers in the group to the topics.

We need to POST a list of topics to the /consumers/{consumer_group}/instances/{consumer}/subscription endpoint.

Subscribe my_consumer to topic my_topic:

  • Curl
  • Python

curl -s -o /dev/null -w "%{http_code}" \
 -X POST \
 "http://localhost:8082/consumers/my_group/instances/my_consumer/subscription"\
 -H "Content-Type: application/vnd.kafka.v2+json" \
 -d '{
 "topics": [
    "my_topic"
 ]
}'
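The Python equivalent is a short POST of the topic list to the subscription endpoint, using the consumer_uri captured earlier:

res = requests.post(
    f"{consumer_uri}/subscription",
    data=json.dumps({"topics": ["my_topic"]}),
    headers={"Content-Type": "application/vnd.kafka.v2+json"},
)
print(res.status_code)  # a 2xx status means the subscription was accepted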

Consume messages

To retrieve messages, send a GET to the /consumers/{consumer_group}/instances/{consumer}/records endpoint.

We'll consume JSON encoded messages, so we have to specify the Accept header: Accept: application/vnd.kafka.json.v2+json.

  • Curl
  • Python

curl -s \
 "http://localhost:8082/consumers/my_group/instances/my_consumer/records?timeout=1000&max_bytes=100000"\
 -H "Accept: application/vnd.kafka.json.v2+json"  | jq .

[
 {
   "topic": "my_topic",
   "key": null,
   "value": "Vectorized",
   "partition": 0,
   "offset": 0
 },
 {
   "topic": "my_topic",
   "key": null,
   "value": "Pandaproxy",
   "partition": 1,
   "offset": 0
 },
 {
   "topic": "my_topic",
   "key": null,
   "value": "JSON Demo",
   "partition": 2,
   "offset": 0
 }
]
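And as a Python sketch, passing timeout and max_bytes as query parameters to mirror the curl call:

res = requests.get(
    f"{consumer_uri}/records",
    params={"timeout": 1000, "max_bytes": 100000},
    headers={"Accept": "application/vnd.kafka.json.v2+json"},
).json()
pretty(res)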

Get consumer offsets

To get the offsets of consumers in the group, we need to send a GET to the /consumers/{consumer_group}/instances/{consumer}/offsets endpoint with the topics and partitions:

  • Curl
  • Python

curl -s \
  -X 'GET' \
 'http://localhost:8082/consumers/my_group/instances/my_consumer/offsets' \
 -H 'accept: application/vnd.kafka.v2+json' \
 -H 'Content-Type: application/vnd.kafka.v2+json' \
 -d '{
 "partitions": [
   {
     "topic": "my_topic",
     "partition": 0
   },
   {
     "topic": "my_topic",
     "partition": 1
   },
   {
     "topic": "my_topic",
     "partition": 2
   }
 ]
}' | jq .

{
 "offsets": [
   {
     "topic": "my_topic",
     "partition": 0,
     "offset": -1,
     "metadata": ""
   },
   {
     "topic": "my_topic",
     "partition": 1,
     "offset": -1,
     "metadata": ""
   },
   {
     "topic": "my_topic",
     "partition": 2,
     "offset": -1,
     "metadata": ""
   }
 ]
}
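In Python, the same request looks like this (as in the curl call, the partition list goes in the body of the GET):

partitions = {
    "partitions": [{"topic": "my_topic", "partition": p} for p in range(3)]
}
res = requests.get(
    f"{consumer_uri}/offsets",
    data=json.dumps(partitions),
    headers={
        "Accept": "application/vnd.kafka.v2+json",
        "Content-Type": "application/vnd.kafka.v2+json",
    },
).json()
pretty(res)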

Commit offsets

Once a consumer has handled messages, the offsets can be committed so the consumer group won't retrieve them again.

We need to POST offsets to the /consumers/{consumer_group}/instances/{consumer}/offsets endpoint:

  • Curl
  • Python

curl -s -o /dev/null -w "%{http_code}" \
  -X 'POST' \
 'http://localhost:8082/consumers/my_group/instances/my_consumer/offsets' \
 -H 'accept: application/vnd.kafka.v2+json' \
 -H 'Content-Type: application/vnd.kafka.v2+json' \
 -d '{
 "partitions": [
   {
     "topic": "my_topic",
     "partition": 0,
     "offset": 0
   },
   {
     "topic": "my_topic",
     "partition": 1,
     "offset": 0
   },
   {
     "topic": "my_topic",
     "partition": 2,
     "offset": 0
   }
 ]
}'
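A Python sketch of the same commit:

offsets = {
    "partitions": [
        {"topic": "my_topic", "partition": p, "offset": 0} for p in range(3)
    ]
}
res = requests.post(
    f"{consumer_uri}/offsets",
    data=json.dumps(offsets),
    headers={"Content-Type": "application/vnd.kafka.v2+json"},
)
print(res.status_code)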

Remove consumer

To remove a consumer from a group, send a DELETE to its base_uri:

  • Curl
  • Python

curl -s -o /dev/null -w "%{http_code}" \
  -X 'DELETE' \
 'http://localhost:8082/consumers/my_group/instances/my_consumer' \
 -H 'Content-Type: application/vnd.kafka.v2+json'
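Or from Python, a DELETE against the consumer's base_uri:

res = requests.delete(
    consumer_uri,
    headers={"Content-Type": "application/vnd.kafka.v2+json"},
)
print(res.status_code)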

Now we can clean up:

docker stop redpanda
docker rm redpanda
docker volume remove redpanda
docker network remove redpanda

Pandaproxy Status

We'll be adding more endpoints and more encodings. For an up-to-date list of features and their status, see the Pandaproxy features meta-issue on GitHub.

Pandaproxy is built on the same principles as Redpanda, but has not yet been optimized for performance. We are continuing to work on Pandaproxy, so make sure you join our Slack Community to get updates on the progress!

To learn more about Pandaproxy, read the latest in our documentation.
