Note: The specific performance metrics in this blog post are from an earlier release of Zeebe. Since this post was published, work has been done to stabilise Zeebe clusters, and this has changed the performance envelope. You can follow the steps in this blog post to test the current release of Zeebe yourself, and derive the current performance envelope.

Zeebe advertises itself as being a “horizontally-scalable workflow engine”. In this post, we cover what that means and how to measure it.

What is horizontal scalability?

(If you are already familiar with the basic concept of horizontal scalability, feel free to skip this section.)

Scalability is an answer to the question, “How can I get my system to process a bigger workload?”

In general, there are two approaches to scalability:

  • Scaling the system vertically by beefing up the machine that runs the system by adding more processing power (CPU, memory, disk, …)
  • Scaling the system horizontally by using more (usually smaller) machines that run as a cluster

Vertical vs. Horizontal Scaling

A system is said to be horizontally scalable if adding more machines to the cluster yields a proportional increase in throughput.

The test setup

Zeebe is designed to be a horizontally scalable system. In the following sections, we’ll show you how to measure that.

In our test, we run the infrastructure on Google Kubernetes Engine (GKE) in Google Cloud:

Test Setup

There are three components:

  • Load generator: a Java program that embeds the Zeebe client and simply starts workflow instances
  • Zeebe cluster: a cluster of Zeebe brokers which we scale to different sizes to understand the relationship between cluster size and performance
  • Monitoring infrastructure: while the test is running, Prometheus is used to collect metrics from the Zeebe cluster and expose them via a dashboard in Grafana
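Zeebe brokers expose Prometheus metrics over HTTP (on port 9600 at /metrics in recent releases). As a rough illustration of how Prometheus could be pointed at the brokers, a scrape configuration might look like the fragment below. The job name and target addresses are assumptions based on typical stateful set naming, not the exact configuration used in this test:

```yaml
# prometheus.yml (fragment) -- target addresses are illustrative
scrape_configs:
  - job_name: 'zeebe'
    metrics_path: /metrics
    scrape_interval: 15s
    static_configs:
      - targets:
          - 'zeebe-0.zeebe:9600'
          - 'zeebe-1.zeebe:9600'
          - 'zeebe-2.zeebe:9600'
```

Grafana then reads from Prometheus to render the dashboard used during the test runs.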

The workload

In the test, we deploy a simple BPMN process that has a start event, a service task, and an end event.

The BPMN process
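For reference, a Zeebe-executable BPMN process of that shape looks roughly like the following XML. The process id and task type here are placeholders, not necessarily what the test deployment used:

```xml
<bpmn:definitions xmlns:bpmn="http://www.omg.org/spec/BPMN/20100524/MODEL"
                  xmlns:zeebe="http://camunda.org/schema/zeebe/1.0"
                  id="definitions" targetNamespace="http://example.com/bpmn">
  <bpmn:process id="benchmark-process" isExecutable="true">
    <bpmn:startEvent id="start"/>
    <bpmn:sequenceFlow id="flow1" sourceRef="start" targetRef="task"/>
    <bpmn:serviceTask id="task" name="Do work">
      <bpmn:extensionElements>
        <!-- the task type is what a job worker subscribes to -->
        <zeebe:taskDefinition type="benchmark-task"/>
      </bpmn:extensionElements>
    </bpmn:serviceTask>
    <bpmn:sequenceFlow id="flow2" sourceRef="task" targetRef="end"/>
    <bpmn:endEvent id="end"/>
  </bpmn:process>
</bpmn:definitions>
```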

The load generator then starts workflow instances through the Zeebe client.
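As an illustration, starting workflow instances with the Zeebe Java client looks roughly like this. The gateway address, process id, and loop count are assumptions for the sketch, not the exact setup used in the test:

```java
import io.camunda.zeebe.client.ZeebeClient;

// A minimal sketch of what the load generator does, using the
// Camunda Zeebe Java client.
public class StartInstances {
    public static void main(String[] args) {
        try (ZeebeClient client = ZeebeClient.newClientBuilder()
                .gatewayAddress("zeebe-gateway:26500") // assumed in-cluster address
                .usePlaintext()
                .build()) {
            for (int i = 0; i < 10_000; i++) {
                client.newCreateInstanceCommand()
                        .bpmnProcessId("benchmark-process") // assumed process id
                        .latestVersion()
                        .send(); // async; call join() to block on the broker response
            }
        }
    }
}
```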


Resources assigned to the brokers

The Zeebe brokers are defined as a stateful set in Kubernetes. We assign:

  • 16 GB of RAM
  • 500 GB of SSD storage
  • 7 CPU cores
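In Kubernetes terms, those resources are declared on the stateful set. A fragment of such a manifest might look like this; the container name, storage class, and volume claim name are illustrative, not the actual manifest used in the test:

```yaml
# StatefulSet fragment -- container resources and volume claim only
spec:
  template:
    spec:
      containers:
        - name: zeebe
          resources:
            limits:
              cpu: "7"
              memory: 16Gi
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: ssd   # an SSD-backed GKE storage class is assumed
        resources:
          requests:
            storage: 500Gi
```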

The results

To measure horizontal scalability, we run the same workload while gradually increasing the number of brokers and partitions (keeping eight partitions per broker):

Brokers   Partitions   Workflow instances started/second (avg)
1         8            1300
3         24           3600
6         48           5200
12        96           8200
24        192          14000
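If you want to examine the raw numbers yourself, a quick way is to compute each run’s speedup relative to the single-broker baseline and compare it against the broker count. The figures below are copied from the table above; everything else is just arithmetic:

```java
public class ScalingCheck {
    // Average workflow instances started per second, from the table above.
    static final int[] BROKERS = {1, 3, 6, 12, 24};
    static final int[] THROUGHPUT = {1300, 3600, 5200, 8200, 14000};

    // Speedup of a given throughput relative to the single-broker baseline.
    static double speedup(int throughput) {
        return throughput / (double) THROUGHPUT[0];
    }

    public static void main(String[] args) {
        for (int i = 0; i < BROKERS.length; i++) {
            System.out.printf("%2d brokers: %5d instances/s (%.1fx the 1-broker rate)%n",
                    BROKERS[i], THROUGHPUT[i], speedup(THROUGHPUT[i]));
        }
    }
}
```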

The following chart shows that Zeebe currently exhibits linear scalability: as we add more brokers, we observe a proportional increase in throughput.

Throughput by cluster size

The following is a screenshot from Grafana which shows the individual test runs:

Grafana dashboard


This post is just a quick overview of what horizontal scalability in Zeebe means. Please note that even though Zeebe is now available as a production-ready release, it still has a lot of potential for performance optimization. We expect “per broker” performance to improve over time, and this is certainly a topic we’ll invest in on an ongoing basis.

“Workflow instances started per second” is just one of many performance metrics that matter to users; this certainly wasn’t a comprehensive look at Zeebe’s performance characteristics. And as with any benchmark blog post, it provides just one point-in-time snapshot of a particular version of Zeebe.

The main takeaway, and what we wanted to demonstrate here, is that we can already see that the system can scale linearly as expected when adding more brokers. Which is awesome!

When it comes to performance, we’re always curious to hear what users are seeing, too, so feel free to let us know if you have any feedback via either of the channels below.

Have a question about Zeebe? Drop by the Slack channel or Zeebe user forum.
