Here is a great example of how Zeebe and BPMN can be used to solve real-world business problems – in this case, in Minecraft!

Ultima VI recreated in Minecraft

The title of this blog post is a nod to the book Real-Life BPMN: Using BPMN 2.0 to Analyze, Improve, and Automate Processes in Your Company, the Camunda “Bible of BPMN”, and to the irony of using “Real World BPMN” with the virtual world of Minecraft. Zeebe and BPMN work everywhere that stateful processes evolve over time!

Background

One of my hobbies is teaching kids to code. For the last couple of years, Tim Marwick and I have been developing a hobby project called Magikcraft – coding JavaScript in Minecraft.

I stumbled onto Minecraft while volunteering as a mentor at CoderDojo on weekends. Kids didn’t want another day of school – they wanted to play games, and play with other kids – and they wanted to play Minecraft. I found ScriptCraft, an open source JavaScript API for Minecraft programming, and discovered that the kids were very motivated to learn to code when it meant blowing things up, throwing other players in the air, and killing zombies with lightning bolts.

Kids using laptops at CoderDojo

Kids love to play Minecraft, and transferring that engagement to learning to code is a Good Thing(tm)

We eventually built what is essentially a JavaScript application server in Minecraft.

I’d been thinking for a while about how Zeebe could be used with Minecraft, and over the weekend I had a great opportunity to both get it working with Minecraft, and solve a real-world business problem in a way that showcases the real power of Zeebe and BPMN to increase business agility.

The Problem

With Magikcraft, players code TypeScript or JavaScript in a web browser, then connect to the Minecraft server, where their code is synced (via GraphQL) and is available for them to execute. It’s a fast edit-compile-run cycle that produces immediate feedback (and gratification!).

 


 

screenshot of code in Magikcraft

Here’s the problem:

To use Minecraft you need a Minecraft user account from Mojang (now owned by Microsoft), and to use Magikcraft you need a Magikcraft account.

Magikcraft login
Minecraft login

At some point we need to associate the two accounts, so that we know which code to sync to the server for a user.

The problem I was looking at is a breakdown in associating the two accounts and syncing the code.

If the user has associated their two accounts at some point in the past, then as soon as they connect to the server the syncing starts – Magic!

If they haven’t associated their Minecraft account, however, then they are prompted to do so. The breakdown is that syncing doesn’t start when they associate their accounts – they need to disconnect from the server and connect again for the association to be detected and code syncing to start.

There are several aspects here that make this a great example of the problem space that Zeebe and BPMN address.

  • Firstly, it’s a business process issue – in this case one that manifests directly as user experience.
  • Secondly, it’s due to the integration of several systems. In these cases there is no obvious single place where responsibility lies, and no single service to fix it in. The problem lies precisely in the orchestration of the interaction between discrete services – the Minecraft login and the Magikcraft login.
  • Thirdly, it spans several services: users must log in to Minecraft in their Minecraft client and to Magikcraft in their web browser, and we rely on them clicking a link in Minecraft and opening that in their web browser to perform the association – then we need to communicate that back to Minecraft, and our services need to modify their behavior based on the state of the business process.
  • And finally, it represents “lost business” – the drop-off at this stage is high. Very few unattended users make it through.

Conceptually, the current behavior of the system looks like the following (click the image to embiggen it):

The current system behavior, conceptually

Well, when you put it like that – it’s obvious! Just add a GOTO statement in there to go from after the connection message to the Retrieve Spells subprocess!

Except the system behaves like this, but it isn’t implemented like this. The business process diagrammed there is an emergent feature of the interaction of the components. This is the system behavior crystallized in a diagram. The actual implementation starts from a technical implementation of each of the pieces, and is then coupled together through a variety of mechanisms and conventions in the code.

So I spent a day digging into the code to see if there was an obvious and easy way to fix it in the existing code base.

Where do you put process state in a system?

It’s not a technical problem with the components themselves. All the pieces are in place, and each does what it is supposed to – a working REST API with calls to generate a JWT for the user, and a GraphQL endpoint to get their code. However, they just aren’t working together in the way that we need them to.

As I looked through the code I ran into a real problem: the orchestration of the business process was coupled with the technical implementation of the services. It was not obvious how to safely change the orchestration without introducing unintended side-effects to the existing behavior.

In this case, it seemed like I had to choose between two evils:

  1. Refactor the code to make it a concrete implementation of the business process that I want to implement right now; or
  2. Make components configurable to deal with business process state in addition to their own internal state domain.

The first option sacrifices future flexibility for current optimization, and the second option sacrifices clarity and simplicity for future flexibility.

Can I just vote for Cthulhu and be done with it?

Why vote for the lesser evil?

Mixing the business process with the services and components that implement it – spreading it everywhere and having it nowhere – makes code complex, difficult to reason about, and hard to modify safely in response to changing business requirements (choose two).

I wished for a high-level DSL – one where I could express the business process in a single comprehensible file, drying it out of the components that implement it – and somewhere to store the business process state.

Maybe a higher order component to represent the business process itself was the answer? Any programming problem can be solved by another (appropriate!) level of abstraction, amirite?

Enter Zeebe

Well, that’s precisely what Zeebe and BPMN (Business Process Model and Notation) are.

BPMN is a DSL for representing business processes, and Zeebe is a “workflow engine for orchestrating microservices” – a state machine for both holding the state of the business process and executing the orchestration of the business process encoded in the BPMN. Zeebe delegates the technical implementation of the business process to stateless microservices that operate on the state that Zeebe passes to them.

This allows a clean separation of concerns: a configurable, stateful component (Zeebe + BPMN) to represent the business process, and decoupled microservices that contain domain objects that concern themselves with things like REST APIs, filesystems, and JWTs, and operate purely on state passed in through an explicit interface.

Edsger Dijkstra put it like this in his seminal paper Go To Statement Considered Harmful:

(O)ur intellectual powers are rather geared to master static relations and our powers to visualize processes evolving in time are relatively poorly developed. For that reason we should do (as wise programmers aware of our limitations) our utmost best to shorten the conceptual gap between the static program and the dynamic process, to make the correspondence between the program (spread out in text space) and the process (spread out in time) as trivial as possible.

What we have with BPMN is the shortest possible conceptual gap between the process and its representation over time, combined with code that can be largely stateless and functional – concerned with static relations, the kind of code that humans can best reason about. We can get back the power of GOTO when we move it outside our code, to a realm where it maps cleanly on to the domain: like assembly language that describes the sequential steps taken by a CPU to execute the logic of our structured program over time, or BPMN that represents the logical flow of a stateful process over time.

Tackling this problem with Zeebe

The first thing I did was model the desired system behavior in the Zeebe Modeler. It’s an Electron app that allows you to graphically model BPMN workflows. After several iterations, I got this (click to embiggen):

The desired system behavior, concretely

I toyed with the idea of putting Spell Syncing into its own process diagram, but settled on this as the best way to have the whole system behavior visible in one view.

Event-driven via Messaging

There are three message events in this diagram that represent user actions:

The start message

The first, circled in the diagram above, is a message start event. This is an incoming message that starts an instance of this workflow, and is initiated when a user joins the Minecraft server. I need to write an event handler for that event in Minecraft and publish a message to Zeebe. We’ll look at how to do that shortly.

The intermediate message catch event

The second is an intermediate message catch event – “User typed /spellbook”. This is reached if a user joins and their Minecraft username is not associated with a Magikcraft account. The business process waits here until the user triggers the association flow from within Minecraft, so I need to publish a message to Zeebe when they do. The second and third message events use Zeebe’s message correlation feature to target a running workflow instance with state updates: each message must be published with a correlation key that identifies the specific workflow instance that was started by the start message.

The desired system behavior, concretely

The final one is an interrupting boundary message event. This is triggered when a user quits the Minecraft server. It will interrupt the business process at any stage of execution and trigger the clean-up. It is also correlated to the running workflow instance using a correlation key.
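
The three messages can be sketched as plain data. This is an illustrative sketch under stated assumptions, not the project's code: the `ZeebeMessage` shape mirrors the fields used in the publishing code in this post, the string values of the message names are hypothetical (in particular `USER_TYPED_SPELLBOOK`), and the Minecraft username is assumed to be the correlation key.

```typescript
// Hypothetical shape of a message bound for Zeebe.
interface ZeebeMessage {
    name: string;
    correlationKey?: string; // omitted for the start message
    timeToLive?: number; // milliseconds
    variables?: Record<string, unknown>;
}

// Message start event: creates a new workflow instance, so there is
// no running instance to correlate with yet.
function userJoined(minecraftUsername: string): ZeebeMessage {
    return {
        name: "USER_JOINED_MINECRAFT_SERVER",
        variables: { minecraftUsername },
    };
}

// Intermediate message catch event: must reach the instance that is
// waiting for this particular player.
function userTypedSpellbook(minecraftUsername: string): ZeebeMessage {
    return {
        name: "USER_TYPED_SPELLBOOK", // hypothetical message name
        correlationKey: minecraftUsername,
    };
}

// Interrupting boundary event: also targets the player's running instance.
function userQuit(minecraftUsername: string): ZeebeMessage {
    return {
        name: "USER_QUIT_MINECRAFT_SERVER",
        correlationKey: minecraftUsername,
        timeToLive: 1000,
    };
}
```

Only the start message lacks a correlation key; the other two carry the same key, which is what lets them find the instance that the start message created.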

Implementing Zeebe messages in Minecraft

I had two paths that I could see to take to implement Zeebe message publishing in Minecraft.

The first would be to build a Minecraft plugin to make the Zeebe Java client available inside the Minecraft server. This would then involve writing an integration wrapper in JavaScript to make it easily accessible from the JS code that we use in Minecraft. I decided against this approach for my POC because I could see a shorter pathway to low orbit.

The second approach would be to use a Node sidecar container with a REST API and ScriptCraft’s existing http module. The Node sidecar would expose a single general-purpose route: publishZeebeMessage. It would consume a message for Zeebe from Minecraft, and then use the existing Zeebe JS client to publish the message.

This turned out to be trickier than I had anticipated (budget for unknown unknowns much?). ScriptCraft’s latest release has a zero-width space character in the http module that breaks it. Once I isolated that and patched it, I was good to go.

Now, I needed to marshal the message from Minecraft to Node over REST. ScriptCraft sends POST requests as application/x-www-form-urlencoded, so I JSON.stringified the message in Minecraft, then JSON parsed it in Node, like this:

Minecraft:

http.request({
    method: "POST",
    params: { message: JSON.stringify(message) },
    url
}, callback)

Node:

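// Assumes a form body parser (for example express.urlencoded()) is registered,
// so that req.body.message is the JSON string sent from Minecraft.
app.post("/publishZeebeMessage", async (req, res) => {
    try {
        const message = JSON.parse(req.body.message);
        console.log(`Publishing message ${message.name} to Zeebe`);

        const payload: PublishMessageRequest<any> = {
            messageId: message.messageId || uuid(),
            name: message.name,
            timeToLive: message.timeToLive || 10000,
            correlationKey: message.correlationKey,
            variables: message.variables || {}
        };
        // Wait for the broker to accept the message before reporting success
        await zbc.publishMessage(payload);
        sendResponseOK(res);
    } catch (e) {
        console.error("Failed to publish message to Zeebe", e);
        res.status(500).send("Failed to publish message");
    }
});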
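
The form-encoding round trip can be checked in isolation. The following is a minimal sketch of the wire format only, assuming the whole message travels in a single form field named `message`; the helper names `encodeForm` and `decodeForm` are hypothetical and are not part of ScriptCraft or the Node sidecar.

```typescript
// Sender side: serialize the message to JSON and wrap it in one form field.
function encodeForm(message: object): string {
    const form = new URLSearchParams();
    form.set("message", JSON.stringify(message));
    return form.toString(); // application/x-www-form-urlencoded body
}

// Receiver side: pull the field back out and parse the JSON.
function decodeForm(body: string): unknown {
    const raw = new URLSearchParams(body).get("message");
    if (raw === null) {
        throw new Error("missing 'message' form field");
    }
    return JSON.parse(raw);
}

const sampleMessage = {
    name: "USER_QUIT_MINECRAFT_SERVER",
    correlationKey: "steve",
    variables: { nested: [1, 2], note: "a & b = c" },
};
const roundTripped = decodeForm(encodeForm(sampleMessage));
```

Nested variables and characters such as `&` and `=` survive the trip because the JSON string is percent-encoded as a single value.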

Any problem can be solved by another level of indirection and enough JSON.stringify(JSON.parse()), amirite? One of the Zeebe engineers (jokingly) suggested using JSONx to solve my problem. In case you haven’t heard of it, “JSONx is an IBM® standard format to represent JSON as XML.”

One JBoss engineer I used to work with had as his email sig: “XML is like violence. If it doesn’t solve your problem, you aren’t using enough of it.”

In that case, call me a committed pacifist.

Event Handlers in Minecraft

The event handlers in Minecraft were easy. I wrote a publishZeebeMessage function, and then added this to the autoload of my Minecraft JS module. This code is transpiled from TypeScript to ES5, and is executed in the Nashorn JavaScript engine, inside the Minecraft JVM:

import * as events from "events";
import { MessageName, publishZeebeMessage } from "../zeebe";

events.playerJoin(({ player }) => {
    console.log(player.name + " joined the server.");
    // Publish start message
    publishZeebeMessage({
        name: MessageName.USER_JOINED_MINECRAFT_SERVER,
        variables: {
            minecraftUsername: player.name
        }
    });
});

events.playerQuit(({ player }) => {
    console.log(player.name + " quit the server");
    // Publish message for boundary event
    publishZeebeMessage({
        name: MessageName.USER_QUIT_MINECRAFT_SERVER,
        correlationKey: player.name,
        timeToLive: 1000
    });
});

console.log("Loaded player join and quit handlers");
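
The publishZeebeMessage function itself is not shown in this post. A minimal sketch of what such a wrapper could look like follows, assuming an HTTP function with the same `(options, callback)` shape as the request call shown earlier. The `createPublisher` factory, the callback signature, and the sidecar URL are all hypothetical; the request function is injected so that the sketch runs outside Minecraft.

```typescript
interface OutgoingMessage {
    name: string;
    correlationKey?: string;
    timeToLive?: number;
    variables?: Record<string, unknown>;
}

interface RequestOptions {
    method: "POST";
    url: string;
    params: { message: string };
}

// Assumed callback shape: (responseCode, responseBody).
type RequestFn = (
    options: RequestOptions,
    callback: (responseCode: number, responseBody: string) => void
) => void;

// Builds a publish function bound to one HTTP client and one sidecar URL.
function createPublisher(request: RequestFn, url: string) {
    return function publishZeebeMessage(message: OutgoingMessage): void {
        request(
            { method: "POST", params: { message: JSON.stringify(message) }, url },
            (responseCode) => {
                if (responseCode !== 200) {
                    console.log("Failed to publish " + message.name + ": HTTP " + responseCode);
                }
            }
        );
    };
}

// Usage with a fake request function that records what would be sent.
const sent: RequestOptions[] = [];
const fakeRequest: RequestFn = (options, callback) => {
    sent.push(options);
    callback(200, "OK");
};
const publish = createPublisher(fakeRequest, "http://localhost:3000/publishZeebeMessage");
publish({ name: "USER_QUIT_MINECRAFT_SERVER", correlationKey: "steve", timeToLive: 1000 });
```

Injecting the request function keeps the wrapper free of any direct dependency on the Minecraft runtime, so the same code can be exercised in a plain Node test.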

The MessageName enum is provided by TypeScript types that the Node Zeebe client can generate from BPMN files, like this:

async function deployBpmn(zbc: ZBClient) {
    const bpmnFile = "./bpmn/player-join-server.bpmn";
    await zbc.deployWorkflow(bpmnFile);
    console.log(
        await BpmnParser.generateConstantsForBpmnFiles(bpmnFile)
    );
}

Since message names are strings, I want to avoid tracking down bugs introduced by “spelling mistakes at midnight”. Having auto-generated enums for Zeebe message names reduces that surface area.
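
To illustrate the benefit, here is a hand-written stand-in for such generated constants. The exact output of the generator may differ, and the string values below are assumptions; the `buildMessage` helper is hypothetical.

```typescript
// Stand-in for generated constants (values are assumed, not generated).
enum MessageName {
    USER_JOINED_MINECRAFT_SERVER = "USER_JOINED_MINECRAFT_SERVER",
    USER_QUIT_MINECRAFT_SERVER = "USER_QUIT_MINECRAFT_SERVER",
}

// Accepting MessageName instead of string means only known names compile.
function buildMessage(name: MessageName, correlationKey: string) {
    return { name, correlationKey };
}

const quitMessage = buildMessage(MessageName.USER_QUIT_MINECRAFT_SERVER, "steve");

// A misspelling is rejected by the compiler rather than found at runtime:
// buildMessage(MessageName.USER_QUITE_MINECRAFT_SERVER, "steve"); // does not compile
// buildMessage("USER_QUIT_MINECRAFT_SERVER", "steve");            // does not compile
```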

Wrap-up

Extracting the business process out of your code allows you to write domain objects that concern themselves purely with the state of their domain (which you should be doing as a best practice anyway, but you know: deadlines).

You can get a clean separation of concerns between the domain and the business process by using a workflow engine to manage the orchestration of the business process and its current state. Putting your business process into BPMN gives you a graphical representation of your business process over time, and one that is executable and guaranteed to be up to date. Just as types are micro-tests and code documentation that is always up to date, BPMN provides you with business process documentation that is always up to date.

You still have to manage the coupling between your microservices and the workflow engine, but this can be managed in the interface of the task handler. The JavaScript library allows you to type the payload of Zeebe workers using TypeScript generics.
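
The idea of a typed task-handler interface can be sketched without the client library. The `Job` and `TaskHandler` types below are simplified stand-ins written for this example, not the Zeebe client's own types, and the association lookup with its in-memory data is hypothetical.

```typescript
// Simplified stand-ins for a job and its handler, parameterized by payload type.
interface Job<Variables> {
    type: string;
    variables: Variables;
}
type TaskHandler<In, Out> = (job: Job<In>) => Out;

// The explicit interface between the workflow and this service.
interface LookupInput {
    minecraftUsername: string;
}
interface LookupOutput {
    isAssociated: boolean;
    magikcraftUserId?: string;
}

// Hypothetical in-memory data standing in for a real account store.
const associations = new Map<string, string>([["steve", "mk-001"]]);

// The handler works only on the variables passed in and returns new variables.
const lookupAssociation: TaskHandler<LookupInput, LookupOutput> = (job) => {
    const magikcraftUserId = associations.get(job.variables.minecraftUsername);
    return magikcraftUserId === undefined
        ? { isAssociated: false }
        : { isAssociated: true, magikcraftUserId };
};

const known = lookupAssociation({ type: "lookup-association", variables: { minecraftUsername: "steve" } });
const unknown = lookupAssociation({ type: "lookup-association", variables: { minecraftUsername: "alex" } });
```

If the process model changes the variables it passes, the mismatch shows up where the handler is typed rather than deep inside the service.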

All-in-all, I found implementing this feature using Zeebe to be fast, and fun! It brought a lot of clarity to the code base, and vastly simplified it.

Structuring a Zeebe Project and Managing Remaining State: A Teaser

The amount of code in the Zeebe solution is minimal compared to the original implementation. My monolithic application – freely mixing orchestration, domain logic, and business process state – has been decomposed into a set of microservices that implement domain operations, and a business process model, with Zeebe orchestrating it.

I’m still DRYing it out, and along the way I’m discovering some patterns that work, inspired by the video “Mastering Chaos – A Netflix Guide to Microservices”. It’s worth watching the whole thing, but the most relevant part starts at 11:38, where Josh Evans talks about how microservices are an abstraction, and how you still have to deal with state over time in your “stateless” microservices, in the form of caching.

I don’t want to again collapse state and time with logic, so keeping a strong separation of concerns in the code is still important there – as is the question of how you structure a Zeebe project in terms of the code. These are discoveries that I will share with you in a future blog post. That’s enough for now.

Hopefully this gives you a glimpse of the wide domain-applicability of Zeebe, and a sense of its power and promise to make programming fun and productive again!

Have a cool project that you’ve built with Zeebe? Drop me a line at josh.wulf@camunda.com. I’d love to hear about it!
