Definitions

A Glossary for Modern Data and AI Teams

Glossary Terms

Welcome to our glossary of the terms and concepts behind modern data systems—from event streaming and real-time pipelines to the broader data platform ecosystem. We’ll continue expanding it to include agentic data plane and AI/ML concepts as the space evolves. Use this as a quick reference, and explore our resources, blog, docs, or Developer page for more.

Access Control List (ACL)

Security & Access Control

A security feature used to define and enforce granular permissions to resources.

Admin API

Networking & APIs

A REST API used to manage and monitor Redpanda Self-Managed clusters. It uses the default port 9644.

Advertised Listener

Core Streaming Platforms

The address a Redpanda broker broadcasts to producers, consumers, and other brokers.

Authentication

Security & Access Control

The process of verifying the identity of a principal, user, or service account.

Authorization

Security & Access Control

The process of specifying access rights to resources.

Availability Zone

Operations & Maintenance

One or more data centers served by high-bandwidth links with low latency, typically within a close distance of one another.

Bearer Token

Security & Access Control

An access token used for authentication and authorization in web applications and APIs.
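
As a small illustration, a bearer token is usually sent in the HTTP `Authorization` header as `Bearer <token>`. The sketch below only builds a request object and sends nothing; the URL, the token value, and the `bearer_headers` helper are hypothetical names chosen for the example.

```python
from urllib.request import Request


def bearer_headers(token):
    """Build the HTTP header that carries a bearer token."""
    return {"Authorization": f"Bearer {token}"}


# Hypothetical endpoint and token, for illustration only; nothing is sent.
request = Request(
    "https://example.com/api/resource",
    headers=bearer_headers("example-token"),
)
```

The server that receives the request decides whether the token is valid and what it permits.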

Beta

Cloud & External Integrations

Features in beta are available for testing and feedback. They are not supported by Redpanda and should not be used in production environments.

Broker

Cluster & Node Operations

An instance of Redpanda that stores and manages event streams. Multiple brokers join together to form a Redpanda cluster.

BYOC

Business & Commercial

Bring Your Own Cloud (BYOC) is a fully-managed Redpanda Cloud deployment where clusters run in your private cloud, so all data is contained in your own environment. Redpanda handles provisioning, operations, and maintenance.

BYOVNet

Business & Commercial

A Bring Your Own Virtual Network (BYOVNet) cluster lets you deploy the Redpanda data plane into your existing Azure VNet, so that you fully manage the networking lifecycle. Compared to standard BYOC, BYOVNet provides more security, but the configuration is more complex.

BYOVPC

Business & Commercial

A Bring Your Own Virtual Private Cloud (BYOVPC) cluster lets you deploy the Redpanda data plane into your existing VPC on AWS or GCP, so that you fully manage the networking lifecycle. Compared to standard BYOC, BYOVPC provides more security, but the configuration is more complex.

Cert-Manager

Security & Access Control

A Kubernetes controller that simplifies the process of obtaining, renewing, and using certificates.

Client

Message & Stream Management

A producer application that writes events to Redpanda, or a consumer application that reads events from Redpanda.

Cluster

Cluster & Node Operations

One or more brokers that work together to manage real-time data streaming, processing, and storage.

Compaction

Time & Data Lifecycle

Feature that retains the latest value for each key within a partition while discarding older values.
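
To make the idea concrete, here is a minimal sketch of compaction over one partition's records. It is not Redpanda's implementation; the `compact` function and the sample `(key, value)` records are assumptions made for the example.

```python
def compact(records):
    """Keep only the latest value for each key, preserving the order of survivors."""
    latest_index = {}
    for index, (key, _value) in enumerate(records):
        latest_index[key] = index  # a later record for the same key wins
    return [
        record
        for index, record in enumerate(records)
        if latest_index[record[0]] == index
    ]


log = [("user-1", "a"), ("user-2", "b"), ("user-1", "c")]
compacted = compact(log)  # the older ("user-1", "a") record is discarded
```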

Connector

Data Processing & Transformation

Enables Redpanda to integrate with external systems, such as databases.

Consumer

Message & Stream Management

A client application that subscribes to Redpanda topics to asynchronously read events.

Consumer Group

Message & Stream Management

A set of consumers that cooperate to read data for better scalability.
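
One way to picture the cooperation is that each partition of a topic is read by exactly one consumer in the group. The sketch below spreads partitions round-robin; real clients support several assignment strategies, and the `assign_partitions` helper and consumer names here are hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer, round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        consumer = consumers[index % len(consumers)]
        assignment[consumer].append(partition)
    return assignment


assignment = assign_partitions([0, 1, 2, 3, 4], ["consumer-a", "consumer-b"])
```

Adding a consumer to the group shrinks each member's share, which is where the scalability comes from.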

Consumer Offset

Message & Stream Management

The position of a consumer in a specific topic partition, used to track which records the consumer has read.

Controller Broker

Cluster & Node Operations

A broker that manages operational metadata for a Redpanda cluster and ensures replicas are distributed among brokers.

Controller Snapshot

Cluster & Node Operations

Snapshot of the current cluster metadata state saved to disk, so broker startup is fast.

Control Plane

Cloud & External Integrations

This part of Redpanda Cloud enforces rules in the data plane, including cluster management, operations, and maintenance.

Data Plane

Cloud & External Integrations

This part of Redpanda Cloud contains Redpanda clusters and other components, such as Redpanda Console, Redpanda Operator, and rpk. It is managed by an agent that receives cluster specifications from the control plane.

Data Sovereignty

Business & Commercial

Containing all your data in your environment. With BYOC, Redpanda handles provisioning, monitoring, and upgrades, but you manage your streaming data without Redpanda's control plane ever seeing it.

Data Stream

Data Processing & Transformation

A continuous flow of events in real time that are produced and consumed by client applications. Also known as event stream.

Data Transforms

Data Processing & Transformation

Framework to manipulate or enrich data written to Redpanda topics using WebAssembly (Wasm).

Dedicated Cloud

Cloud & External Integrations

A fully-managed Redpanda Cloud deployment option where you host your data in Redpanda's VPC, and Redpanda handles provisioning, operations, and maintenance. Dedicated clusters are single-tenant deployments that support private networking.

Event

Data Processing & Transformation

A record of something changing state at a specific time.

HTTP Proxy

Networking & APIs

Redpanda HTTP Proxy (pandaproxy) allows access to your data through a REST API. It uses the default port 8082.

Kafka API

Core Streaming Platforms

Producers and consumers interact with Redpanda using the Kafka API. It uses the default port 9092.

Learner

Cluster & Node Operations

A broker that is a follower in a Raft group but is not part of the quorum.
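
The arithmetic behind this can be sketched as follows, assuming the usual Raft majority rule: only voting members count toward the majority, so a learner receives data without changing how many acknowledgements are needed. The function names are hypothetical.

```python
def quorum_size(voters):
    """Smallest majority of the voting members in a Raft group."""
    return voters // 2 + 1


def is_committed(acks_from_voters, voters):
    """A write is committed once a majority of voters have acknowledged it."""
    return acks_from_voters >= quorum_size(voters)


# Three voters plus any number of learners: the majority is still two.
needed = quorum_size(3)
```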

Limited Availability

Cloud & External Integrations

Features in limited availability (LA) are production-ready and are covered by Redpanda Support for early adopters.

Listener

Core Streaming Platforms

Configuration on a broker that defines how it should accept client or inter-broker connections.

Maintenance Mode

Cluster & Node Operations

A state where a Redpanda broker temporarily doesn't take any partition leaderships.

MCP tool

AI/ML & Agentic Systems

A function that an AI assistant can call to perform a specific task, such as fetching data from an API, querying a database, or processing streaming data.

Message

Message & Stream Management

One or more records representing individual events being transmitted.

Node

Cluster & Node Operations

A machine, which could be a server, a virtual machine (instance), or a Docker container.

Offset

Message & Stream Management

A unique integer assigned to each record to show its location in the partition.
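
A minimal sketch of how offsets behave, assuming an in-memory stand-in for a partition: each appended record receives the next integer, starting at 0. The `PartitionLog` class is hypothetical and exists only for illustration.

```python
class PartitionLog:
    """Toy append-only log that hands out sequential offsets."""

    def __init__(self):
        self._records = []

    def append(self, record):
        offset = len(self._records)  # next free position in the log
        self._records.append(record)
        return offset

    def read(self, offset):
        return self._records[offset]


partition = PartitionLog()
first = partition.append("order created")
second = partition.append("order paid")
```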

Offset Commit

Message & Stream Management

An acknowledgement that the event has been read.
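
The sketch below shows why commits matter: after a restart, a consumer resumes from the committed position rather than from the beginning. By Kafka convention the committed value is the offset of the next record to read. The `CommittedOffsets` class, the group and topic names, and the choice to start at 0 when nothing has been committed are all assumptions for the example.

```python
class CommittedOffsets:
    """Toy store of committed positions, keyed by (group, topic, partition)."""

    def __init__(self):
        self._positions = {}

    def commit(self, group, topic, partition, next_offset):
        self._positions[(group, topic, partition)] = next_offset

    def resume_from(self, group, topic, partition):
        # Assumption: start at offset 0 when the group has never committed.
        return self._positions.get((group, topic, partition), 0)


records = ["e0", "e1", "e2", "e3"]
store = CommittedOffsets()

# Process the first two records, then commit the next offset to read.
processed = records[store.resume_from("billing", "orders", 0):2]
store.commit("billing", "orders", 0, 2)

# After a restart, the consumer picks up where it left off.
remaining = records[store.resume_from("billing", "orders", 0):]
```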

Pandaproxy

Networking & APIs

Original name for the subsystem of Redpanda that allows access to your data through a REST API.

Partition

Message & Stream Management

A subset of events in a topic, like a log file. It is an ordered, immutable sequence of records.

Partition Leader

Cluster & Node Operations

Every Redpanda partition forms a Raft group with a single elected leader. This leader handles all writes.

Pipeline

Data Processing & Transformation

A single configuration file running in Redpanda Connect with an input connector, an output connector, and optional processors in between.

Principal

Security & Access Control

An entity (such as a user account or a service account) that accesses resources.

Producer

Message & Stream Management

A client application that writes events to Redpanda.

Rack

Operations & Maintenance

A failure zone that has one or more Redpanda brokers assigned to it.

Rack Awareness

Operations & Maintenance

Feature that lets you distribute replicas of the same partition across different racks.
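
As a simplified picture of the goal, the sketch below places each replica of a partition in a different rack. It is not Redpanda's placement algorithm, which also has to balance load; the `place_replicas` helper, rack names, and broker IDs are hypothetical.

```python
def place_replicas(brokers_by_rack, replication_factor):
    """Pick one broker from each rack until every replica has a home."""
    racks = sorted(brokers_by_rack)
    if replication_factor > len(racks):
        raise ValueError("not enough racks for one replica per rack")
    return [(rack, brokers_by_rack[rack][0]) for rack in racks[:replication_factor]]


placement = place_replicas(
    {"rack-a": [0, 1], "rack-b": [2], "rack-c": [3]},
    replication_factor=3,
)
```

With this placement, losing any single rack still leaves two of the three replicas available.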

Raft

Cluster & Node Operations

The consensus algorithm Redpanda uses to coordinate writing data to log files and replicating that data across brokers.

RBAC

Security & Access Control

Role-based access control lets you assign users access to specific resources.

Rebalancing

Cluster & Node Operations

Process of moving partition replicas and transferring partition leadership for improved performance.

Record

Message & Stream Management

A self-contained data entity with a defined structure, representing a single event.

Redpanda Cloud

Cloud & External Integrations

A fully-managed data streaming service deployed with Redpanda Console. It includes automated upgrades and patching, backup and recovery, data and partition balancing, and built-in connectors.

Redpanda Community Edition

Business & Commercial

Redpanda software that is available under the Redpanda Business Source License (BSL). These core features are free and source-available.

Redpanda Connect

Core Streaming Platforms

A framework for building data streaming applications using declarative YAML configurations.

Redpanda Connect MCP server

AI/ML & Agentic Systems

A process that exposes Redpanda Connect components to MCP clients.

Redpanda Console

Core Streaming Platforms

The web-based UI for managing and monitoring Redpanda clusters and streaming workloads.

Redpanda Enterprise Edition

Business & Commercial

Redpanda software that is available under the Redpanda Community License (RCL). Includes enterprise features like Tiered Storage.

Redpanda Helm chart

Infrastructure & Deployment

Generates and applies all the manifest files you need for deploying Redpanda in Kubernetes.

Redpanda Operator

Infrastructure & Deployment

Extends Kubernetes with custom resource definitions (CRDs), which allow Redpanda clusters to be treated as native Kubernetes resources.

Remote MCP

AI/ML & Agentic Systems

An MCP server hosted in your Redpanda Cloud cluster. It exposes custom tools that AI assistants can call to access your data and workflows.

Remote Read Replica

Cluster & Node Operations

A read-only topic that mirrors a topic on a different cluster, using data from Tiered Storage.

Replicas

Cluster & Node Operations

Copies of partitions that are distributed across different brokers.

Replication Factor

Cluster & Node Operations

The number of partition copies in a cluster.

Resource Group

Cloud & External Integrations

A container for Redpanda Cloud resources, including clusters and networks.

Retention

Time & Data Lifecycle

The mechanism for determining how long Redpanda stores data on local disk or in object storage before purging it.
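
A minimal sketch of time-based retention, assuming data is purged one segment at a time once the newest record in the segment falls outside the retention window. Size-based limits are not modeled, and the field names and `apply_retention` helper are hypothetical.

```python
def apply_retention(segments, now_ms, retention_ms):
    """Return the segments whose newest record is still inside the window."""
    return [
        segment
        for segment in segments
        if now_ms - segment["newest_timestamp_ms"] <= retention_ms
    ]


segments = [
    {"base_offset": 0, "newest_timestamp_ms": 1_000},
    {"base_offset": 100, "newest_timestamp_ms": 5_000},
    {"base_offset": 200, "newest_timestamp_ms": 9_000},
]
kept = apply_retention(segments, now_ms=10_000, retention_ms=6_000)
```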

Rolling Upgrade

Operations & Maintenance

The process of upgrading each broker in a Redpanda cluster, one at a time.

rpk

Core Streaming Platforms

Redpanda's command-line interface tool for managing Redpanda clusters.

Schema

Data Formats & Schemas

An external mechanism to describe the structure of data and its encoding.

Schema Registry

Data Formats & Schemas

Redpanda Schema Registry is the interface for storing and managing event schemas. It uses the default port 8081.

Seastar

Performance & Resource Management

An open-source thread-per-core C++ framework, which binds all work to physical cores. Redpanda is built on Seastar.

Seed Server

Cluster & Node Operations

The initial set of brokers that a Redpanda broker contacts to join the cluster.

Segment

Time & Data Lifecycle

Discrete part of a partition, used to break down a continuous stream into manageable chunks.

Self-Managed

Business & Commercial

Redpanda Self-Managed refers to the product offering that includes both the Enterprise Edition and the Community Edition.

Serialization

Data Formats & Schemas

The process of converting a record into a format that can be stored.
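
For example, a record held as a dictionary can be serialized to UTF-8 JSON bytes and later deserialized back. JSON is only one possible format (Avro and Protobuf are common alternatives); the function names and sample record are assumptions for the sketch.

```python
import json


def serialize(record):
    """Convert a record into bytes that can be stored or sent over the wire."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def deserialize(payload):
    """Rebuild the record from its stored bytes."""
    return json.loads(payload.decode("utf-8"))


payload = serialize({"order_id": 42, "status": "paid"})
restored = deserialize(payload)
```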

Serverless

Cloud & External Integrations

Serverless is the fastest and easiest way to start data streaming. You host your data in Redpanda's VPC, and Redpanda handles automatic scaling, provisioning, operations, and maintenance.

Service Account

Security & Access Control

An identity independent of the user who created it that can be used to authenticate and perform operations.

Shard

Performance & Resource Management

A CPU core.

Sink Connector

Data Processing & Transformation

Exports data from a Redpanda cluster into a target system.

Source Connector

Data Processing & Transformation

Imports data from a source system into a Redpanda cluster.

Subject

Data Formats & Schemas

A logical grouping or category for schemas.

Thread-Per-Core

Performance & Resource Management

Programming model that allows Redpanda to pin each of its application threads to a CPU core.

Tiered Storage

Cloud & External Integrations

Feature that lets you offload log segments to object storage in near real-time.

Topic

Message & Stream Management

A logical stream of related events that are written to the same log.

Topic Partition

Message & Stream Management

One partition of a topic. A topic's partitions may be spread across multiple brokers.
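
One common way records end up in a particular partition is by hashing the record key, so that records with the same key always land in the same partition. The sketch below uses CRC32 purely as a stand-in hash; Kafka clients use their own partitioners (the Java client's default is murmur2-based), and `choose_partition` is a hypothetical helper.

```python
import zlib


def choose_partition(key, partition_count):
    """Map a record key (bytes) to a partition number deterministically."""
    return zlib.crc32(key) % partition_count


partition_for_first_event = choose_partition(b"customer-7", 3)
partition_for_second_event = choose_partition(b"customer-7", 3)
```

Because both calls use the same key, both events go to the same partition, which preserves their relative order.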


Try Redpanda. We won’t bite.

You'll find it's easy to install and simple to get up and running with our lightning-fast streaming data platform.