New patch releases for Zeebe are now available: 0.22.4 and 0.23.3. Both contain various bug fixes as well as minor enhancements. You can grab the releases via the usual channels.

Zeebe 0.23.3 is fully compatible with 0.23.2, as is 0.22.4 with 0.22.3. This means it is possible to perform a rolling upgrade for your existing clusters. For Camunda Cloud users, Zeebe 0.23.3 is already the latest production version, meaning your existing clusters should have been migrated by the time you see this post.

Without further ado, here is a list of the notable changes.

Zeebe 0.22.4

The 0.22.4 patch contains four bug fixes, three of which are also part of 0.23.3: the fix that allows exporting custom job headers to Elasticsearch, the NPE fix when triggering timers, and the fix for a race condition when cancelling a workflow instance. You can read more about these in the 0.23.3 release notes below.

Reopen LogStream on fail over

User strawhat5 reported a bug where a broker would reject all incoming requests and throw an IllegalStateException: Failed to recover broker exception on restart or fail over. This was caused by a race condition combined with the reuse of a mutable LogStream object, which left the log in an inconsistent state, one of the highest severity bugs that can affect the Zeebe broker. After some investigation, this was fixed with the following PR (later backported to 0.22).

Zeebe 0.23.3

Export custom job headers to Elasticsearch

While it’s always been possible to export custom job headers to Elasticsearch, the index template that was previously used caused an error for some values (specifically, when a header contained a period in its name), as reported by user eetay. The fix was to disable indexing of custom headers: the headers are still stored in Elasticsearch, but are no longer searchable via queries.

Uncaught NullPointerException on event handling

Here we see the result of improving the Log4J2 Stackdriver layout – this bug was found thanks to it, allowing the development team to use Google Cloud’s Error Reporting tool to pre-emptively find and fix bugs.

This bug was the result of a race condition: a timer could be triggered just as the element it referred to was left during process execution, but before the timer had been cancelled.

Race condition resulting in stuck or hanging workflow instances

Here there were actually two bugs with the same cause: #4400 and #4352. Both were caused by a race condition when cancelling or interrupting running workflow instances, which would result in stuck instances that could neither make progress nor be cancelled, leaving behind “garbage” data and wasting resources. Or, as our developer saig0 put it:

On terminating the sub-process, a new token is spawned for the sequence flow that is waiting on the joining parallel gateway. As a result, the flow scope of the sub-process can not be completed or terminated.

The fix here was to publish only deferred boundary events when the sub-process is terminated, so that no other events are published.

Improve Stackdriver Log4J2 layout to make it spec compliant

We’re aware that this is not a bug fix, and normally this wouldn’t be included in a patch release. However, in this case we decided to include it, as it brings better integration with Stackdriver-related tools, such as Error Reporting. We run most of our tests and benchmarks on Google Cloud, and as such good integration with Stackdriver tools increases our ability to diagnose issues, both retroactively and preemptively.

The enhancement does not include all possible features (e.g. adding custom labels or operation grouping), but focuses on making sure that errors are properly reported. So if you’re using the layout, be aware that it has now changed a little:

  • It is now called simply StackdriverLayout, as opposed to StackdriverJSONLayout
  • While not garbage free in the Log4J2 sense, it tries to be as close as possible to it
  • You can configure a service context so your errors are properly grouped in Error Reporting:
    • The service name can be configured via the system property log.stackdriver.serviceName, or the environment variable ZEEBE_LOG_STACKDRIVER_SERVICENAME. If omitted, this defaults to “zeebe”.
    • The service version can be configured via the system property log.stackdriver.serviceVersion, or the environment variable ZEEBE_LOG_STACKDRIVER_SERVICEVERSION. If omitted, this defaults to “development”.
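
To make the service context settings above concrete, here is a minimal Java sketch of how such a lookup could work. It is not the layout's actual implementation: the class name, the resolve helper, and the precedence (system property first, then environment variable, then default) are assumptions made for illustration. Only the property names, environment variable names, and default values come from the list above.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not part of Zeebe: illustrates one way the service
// context could be resolved from a system property, an environment variable,
// or a default value.
final class StackdriverServiceContext {

  // Assumed precedence: system property, then environment variable, then default.
  static String resolve(
      Properties properties,
      Map<String, String> environment,
      String propertyKey,
      String environmentKey,
      String defaultValue) {
    String fromProperty = properties.getProperty(propertyKey);
    if (fromProperty != null && !fromProperty.isEmpty()) {
      return fromProperty;
    }
    String fromEnvironment = environment.get(environmentKey);
    if (fromEnvironment != null && !fromEnvironment.isEmpty()) {
      return fromEnvironment;
    }
    return defaultValue;
  }

  public static void main(String[] args) {
    String serviceName =
        resolve(
            System.getProperties(),
            System.getenv(),
            "log.stackdriver.serviceName",
            "ZEEBE_LOG_STACKDRIVER_SERVICENAME",
            "zeebe");
    String serviceVersion =
        resolve(
            System.getProperties(),
            System.getenv(),
            "log.stackdriver.serviceVersion",
            "ZEEBE_LOG_STACKDRIVER_SERVICEVERSION",
            "development");
    System.out.println(serviceName + " " + serviceVersion);
  }
}
```

Running this with -Dlog.stackdriver.serviceName=broker-1, or with ZEEBE_LOG_STACKDRIVER_SERVICENAME exported in the shell, would change the printed service name accordingly.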

Do not print Elasticsearch credentials in the console log

When starting the Zeebe broker or standalone gateway, one of the first things we do is print out the effective configuration if the log level is DEBUG or lower. One issue with this is that it would print out the Elasticsearch credentials, which is sensitive information.

The fix in the end was a broader one: any configuration field called “username” or “password” is now printed as three asterisks (“***”).
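
The masking rule can be sketched in a few lines of Java. This is an illustration only, not the code used in Zeebe: the ConfigMasker class, its mask method, and the representation of the configuration as nested maps are all assumptions. The sketch replaces the value of any field named exactly “username” or “password”, at any nesting depth, before the configuration is printed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Zeebe: masks sensitive fields in a
// configuration represented as nested maps.
final class ConfigMasker {

  private static final Set<String> SENSITIVE_FIELDS = Set.of("username", "password");
  private static final String MASK = "***";

  // Returns a copy of the configuration with sensitive values replaced;
  // the input map is left untouched.
  static Map<String, Object> mask(Map<String, Object> config) {
    Map<String, Object> masked = new LinkedHashMap<>();
    for (Map.Entry<String, Object> entry : config.entrySet()) {
      Object value = entry.getValue();
      if (SENSITIVE_FIELDS.contains(entry.getKey())) {
        masked.put(entry.getKey(), MASK);
      } else if (value instanceof Map) {
        // Recurse into nested sections, e.g. an exporter's authentication block.
        @SuppressWarnings("unchecked")
        Map<String, Object> nested = (Map<String, Object>) value;
        masked.put(entry.getKey(), mask(nested));
      } else {
        masked.put(entry.getKey(), value);
      }
    }
    return masked;
  }

  public static void main(String[] args) {
    Map<String, Object> authentication = new LinkedHashMap<>();
    authentication.put("username", "elastic");
    authentication.put("password", "changeme");

    Map<String, Object> exporter = new LinkedHashMap<>();
    exporter.put("url", "http://localhost:9200");
    exporter.put("authentication", authentication);

    // Only the masked copy is ever logged.
    System.out.println(mask(exporter));
  }
}
```

Returning a masked copy rather than mutating the input keeps the real credentials available to the component that needs them, while only the copy reaches the log.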

Get In Touch

There are a number of ways to get in touch with the Zeebe community to ask questions and give us feedback.

We hope to hear from you!
