Resolving Process Incidents and Exceptions with Operate

Did you know that you can keep your process instances moving with Operate? Learn how in this tutorial.

When you are developing complex process orchestrations and automations, there may be paths or unexpected errors that have not been tested. If your process is lengthy and complex, restarting every process instance to test a small section can be tedious and time-consuming, as can extracting those steps into standalone process chunks. But you don’t have to do that with Operate.

In fact, Operate allows you to add variables, set existing variables, and move your process instance to earlier or later tasks. With Operate, you have many options to keep your business moving by resolving incidents and exceptions for process instances in flight.

Incidents happen

Even in a well-thought-out process, it is hard to account for every possible error, external system, branch, or variable. You may realize that you misspelled a variable name or did not properly parse the JSON response from a Connector call, either of which can trigger an incident. Fortunately, Operate makes it easy to view process instance information and offers ways to resolve these incidents.

When developing a process, it is crucial to test the end-to-end process in depth to confirm that it works as expected. When you have a highly orchestrated process with many different connectors, branches, and tasks, this testing can be time-consuming. You might successfully test 80% of your process only to find that the last task has some unforeseen issue. Restarting from the beginning and repeating all the preceding tasks just to focus on the final task is not optimal.

What happens if you are already in production with a process? Incidents can still occur. For example, an external service might go down. The process instance may exhaust its configured number of retries before the external service is up and running again. This will generate an incident and an error message.
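
The retry behavior described above can be illustrated with a toy model. The following Python sketch is not how Zeebe is implemented; the `Job` class and the `run_with_retries` function are hypothetical names, used only to show how a task that keeps failing consumes its retries and ends up flagged with an incident.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Job:
    """Toy stand-in for a job; not Zeebe's actual data model."""
    retries: int
    incident: Optional[str] = None


def run_with_retries(job: Job, work: Callable[[], None]) -> bool:
    """Attempt `work` until it succeeds or the retries are used up.

    Returns True on success. When no retries remain, the last error is
    recorded as an incident on the job and False is returned.
    """
    last_error = "job started with no retries"
    while job.retries > 0:
        try:
            work()
            return True
        except Exception as exc:  # every failure consumes one retry
            job.retries -= 1
            last_error = str(exc)
    job.incident = f"No retries left: {last_error}"
    return False


def unavailable_service() -> None:
    """Simulates an external service that is down."""
    raise ConnectionError("service unavailable")


job = Job(retries=3)
succeeded = run_with_retries(job, unavailable_service)
print(succeeded, job.retries, job.incident)
```

In this model the process does not fail outright. It stops at the task and waits, which is the state Operate surfaces as an incident.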

How can Operate help?

In Operate, you can see the process state of your instance as well as all current instance variables and values and any possible incidents.

Process Instance displayed in Operate depicting incident at a task.

Incidents are designated by the red marker and the notification at the top of the screen. Initial inspection of the incident in Operate shows that a connection to an external source was denied due to an expired authorization token.

Reviewing provided incident information.
Viewing the exact error returned by the failed task.
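
The same incident information can be requested programmatically. The sketch below only builds the request body and formats a result; it assumes the Operate REST API offers an incident search at `POST /v1/incidents/search` that accepts a `filter` with `state` and `processInstanceKey`, so check the API reference for your version before relying on it. The helper names and the sample record are hypothetical and invented for illustration.

```python
import json
from typing import Any, Dict, Optional


def incident_search_body(process_instance_key: Optional[int] = None,
                         size: int = 20) -> Dict[str, Any]:
    """Build a search body for active incidents (assumed API shape)."""
    incident_filter: Dict[str, Any] = {"state": "ACTIVE"}
    if process_instance_key is not None:
        incident_filter["processInstanceKey"] = process_instance_key
    return {"filter": incident_filter, "size": size}


def summarize(incident: Dict[str, Any]) -> str:
    """Render one incident record as a single readable line."""
    return (f"{incident.get('type', 'UNKNOWN')} in instance "
            f"{incident.get('processInstanceKey')}: "
            f"{incident.get('message', '')}")


# Hypothetical record, invented for illustration only.
sample_incident = {
    "key": 1001,
    "processInstanceKey": 2002,
    "type": "JOB_NO_RETRIES",
    "message": "authorization token expired",
}

print(json.dumps(incident_search_body(process_instance_key=2002)))
print(summarize(sample_incident))
```

Sending the request (authentication, base URL, HTTP client) is deliberately left out, since those details depend on how your cluster is deployed.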

The incident information can be reviewed to narrow down the cause and determine the appropriate fix. In this example, the incident took place within the Claim Adjuster Process subprocess. The next step is to confirm exactly what happened by finding the Claim Adjuster Process instance that shows an incident.

Opening this process in Operate shows that the incident occurred at the task that generates the Google document. Additional information about the incident is also available in this view.

Displaying the incident task with details about the incident - in this case at the Google document generation task.

Making in-flight changes to running process instances preserves the process history and data, which are an invaluable component of your process maturity model.

How to keep the process moving

After inspecting the error message and recognizing that the external service is down or something else is amiss, you can make a fix to get that service back online. In this case, you need to update the value of the Connector Secret in the Zeebe cluster.

Retry the task

Now that the secret has been updated, you simply need to click the retry icon within the process instance (in this case the subprocess for Claim Adjuster Flow). This executes a retry on the specific activity that was stopped by the incident.

Selecting the retry option in Operate.

Alternatively, if more than one process instance was stopped, you can execute a batch retry, which retries every instance that was stopped by an incident.

In either case, once that retry for the task is complete, the process instances will continue.

Resulting process after successful retry of an incident at a task.
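
For cases where a scripted retry is preferred over clicking in Operate, a comparable sequence is to give the failed job a retry and then resolve the incident. The sketch below only plans the requests as `(method, path, body)` tuples. It assumes the Camunda 8 REST API paths `PATCH /v2/jobs/{jobKey}` and `POST /v2/incidents/{incidentKey}/resolution`, which may differ in your version; the function names are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple

# (HTTP method, path, JSON body or None)
Request = Tuple[str, str, Optional[Dict[str, Any]]]


def retry_requests(incident_key: int,
                   job_key: Optional[int] = None,
                   retries: int = 1) -> List[Request]:
    """Plan the requests that retry one incident (assumed API paths)."""
    requests: List[Request] = []
    if job_key is not None:
        # A job that ran out of retries needs at least one before it can run.
        requests.append(("PATCH", f"/v2/jobs/{job_key}",
                         {"changeset": {"retries": retries}}))
    requests.append(("POST", f"/v2/incidents/{incident_key}/resolution", None))
    return requests


def batch_retry_requests(incidents: Iterable[Dict[str, Any]]) -> List[Request]:
    """Plan a retry for every incident record in `incidents`."""
    plan: List[Request] = []
    for incident in incidents:
        plan.extend(retry_requests(incident["key"], incident.get("jobKey")))
    return plan


# Hypothetical keys, invented for illustration only.
for method, path, body in batch_retry_requests(
        [{"key": 1001, "jobKey": 3003}, {"key": 1002}]):
    print(method, path, body)
```

Keeping the planning separate from the HTTP calls makes it possible to review exactly which instances a batch would touch before anything is sent.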

Modifying your process instance

There are exceptions to every rule, and this applies to your process implementation as well. An existing process might require an exception because of unforeseen circumstances or something that just doesn’t fit the implementation. Even then, it remains important to maintain the history of the process instance and to see it through to completion.

Operate also provides the ability to modify an active process instance so that execution can continue. This feature allows users to move the process forward and backward in a given workflow, as well as cancel execution or add variables to running process instances.

To enter modification mode, select the icon for modifying an instance. Operate presents some guidance before you make alterations to the process.

Process Instance Modification mode preliminary dialog.

From here, you can elect to move your process instance token forward to a later task in the process or to re-execute previous tasks.

Selecting the option to modify a process instance with a move.

Selecting the previous tasks to re-execute, or the subsequent task to move to, is as simple as a mouse click, and the instance is modified.

Process instance modification in Operate.

This feature is often overlooked when reviewing instances in Operate, but can serve a very important purpose when handling exceptions or stopped processes.
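
A move of this kind can be thought of as two instructions: activate the target task and terminate the element instance where the token currently sits. The sketch below builds such a request body. It assumes the modification endpoint of the Camunda 8 REST API (`POST /v2/process-instances/{processInstanceKey}/modification`) with `activateInstructions` and `terminateInstructions`; the function name, the element ID, and the keys are hypothetical.

```python
from typing import Any, Dict, Optional


def move_token_body(target_element_id: str,
                    source_element_instance_key: int,
                    variables: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a modification body that moves a token (assumed API shape).

    The target element is activated, the currently active element
    instance is terminated, and optional variables are set on activation.
    """
    activate: Dict[str, Any] = {"elementId": target_element_id}
    if variables:
        activate["variableInstructions"] = [{"variables": variables}]
    return {
        "activateInstructions": [activate],
        "terminateInstructions": [
            {"elementInstanceKey": source_element_instance_key}
        ],
    }


# Hypothetical element ID, key, and variable, invented for illustration only.
body = move_token_body("Task_GenerateDocument", 4004,
                       variables={"documentTitle": "Claim summary"})
print(body)
```

The same body shape covers both directions: whether the move goes forward or backward depends only on which element ID is activated.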

Let Operate keep your business processes running smoothly

When working with complex process orchestrations, process instances can be long-lived with a great many tasks. If a process is already in flight, its instance history holds a rich set of data about the process tasks, interactions with other systems, the time between tasks, and the duration of each task. This is a valuable part of process governance, and losing this information by restarting a process instance from the beginning damages the intelligence being gathered with each instance. The capability to make these in-flight changes while maintaining the process history is imperative as your organization grows its process maturity model.

Learn more about Operate and get started with a free trial of Camunda today.
