Enable AI-Powered Process Optimization with ML-Ready Dataset

As companies gather and store vast amounts of data, the need for tools to turn that data into actionable insights continues to grow. There is immense potential in real-world process execution data if used correctly. According to Gartner, by 2026, organizations that create trustworthy, purpose-driven AI will see over 75% of AI innovations succeed, compared to 40% among those that don’t.

How can I use AI to optimize my business processes?

This is where machine learning and AI come in. By using machine learning algorithms, businesses can analyze their process execution data to identify patterns and trends that can help optimize their processes, reduce costs, and enhance customer experiences.

Examples of AI and machine learning in process optimization

Real-world examples include performance and load forecasting for businesses with highly seasonal operations, such as e-commerce or tax service providers, and custom correlation analysis to focus data analysis efforts. For instance, businesses can explore the relationship between variable values, such as the input channel (i.e., how the process was started), and process performance (e.g., turnaround times) to prioritize new areas of investigation. Understanding the correlation between task duration and the overall duration of a process instance can also help you find trends in operations.

For example, you might find that an order not packed by 11 am is 80% less likely to reach the buyer within the 3-day SLA window. Other use cases include predicting how long a process will take given its variables, or helping knowledge workers make decisions with AI-powered suggestions. Since your data science team can use the process execution data as they see fit, the possibilities are limited only by your imagination and availability.
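A rule like the 11 am cutoff could be checked by comparing on-time delivery rates on either side of it. The following is a minimal sketch with made-up orders; the field names packed_hour and delivered_within_sla are assumptions for illustration only.

```python
def on_time_rate(rows):
    """Share of orders delivered within the SLA, or None if there are no rows."""
    if not rows:
        return None
    return sum(1 for row in rows if row["delivered_within_sla"]) / len(rows)


# Hypothetical orders: the hour the order was packed and whether it met the SLA.
orders = [
    {"packed_hour": 9, "delivered_within_sla": True},
    {"packed_hour": 10, "delivered_within_sla": True},
    {"packed_hour": 10, "delivered_within_sla": True},
    {"packed_hour": 10, "delivered_within_sla": False},
    {"packed_hour": 12, "delivered_within_sla": False},
    {"packed_hour": 13, "delivered_within_sla": True},
    {"packed_hour": 15, "delivered_within_sla": False},
    {"packed_hour": 16, "delivered_within_sla": False},
]

CUTOFF_HOUR = 11
early = [o for o in orders if o["packed_hour"] < CUTOFF_HOUR]
late = [o for o in orders if o["packed_hour"] >= CUTOFF_HOUR]

early_rate = on_time_rate(early)
late_rate = on_time_rate(late)
print(f"packed before cutoff: {early_rate:.0%}, packed after: {late_rate:.0%}")
```

With real data, the same comparison can be repeated for several cutoff hours to see where the on-time rate drops most sharply.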

However, this is often easier said than done. Data preparation is a time-consuming process that is critical to ensure the data meets specific quality criteria. Typically, 80% of the effort in a data analysis project is spent on data preparation.

Improving our existing raw data exports

Before introducing the machine learning-ready dataset export feature, Camunda Optimize provided a raw data report that offered basic data, such as identifiers, start and end dates, and variables. Although useful, this data had its limitations.

Since the output was not optimized for ML and AI, someone had to spend time organizing and cleaning the data before ingestion, resulting in slow progress and added friction to the process.

Democratizing process data for ML and AI

Camunda Optimize’s new machine learning-ready dataset export feature helps democratize process data for ML and AI tools. The exported data is already organized and pre-processed, making it much easier to work with.

Business analysts and data scientists can use the data to train a new model or feed it into an existing model to make predictions for future process instances using the existing execution data. For example, it is possible to predict how long an instance will take to complete based on the tasks and variables in the process flow.

The dataset includes columns that Camunda Optimize added for this purpose, such as the total number of incidents per process instance, the number of open incidents, the number of user tasks executed, and the total duration of each flow node of a specific type, such as a task or an event.

How to export your ML-ready dataset in Optimize

Using the machine learning-ready dataset export feature in Camunda Optimize is easy. Here’s how:

1. Log in to your Camunda Platform account (or sign up for a free trial) and head over to Optimize.
2. Create a new report, select your process, and choose “blank report.”
3. Under report setup, change the view to “raw data” and click save.
4. Create and use filters as needed with your data.
5. Export the dataset as a CSV file.
6. Import the CSV file into your data analysis and manipulation tool of choice, such as pandas.
7. Use the data to perform offline model training and analysis.
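The import step above can be sketched as follows, using Python's built-in csv module. The header names in the sample (processInstanceId, duration, numberOfIncidents, numberOfUserTasks) are assumptions made for this example, so check them against the header row of your own export.

```python
import csv
import io

# Stand-in for an exported file; the header names are assumptions.
SAMPLE_CSV = """processInstanceId,duration,numberOfIncidents,numberOfUserTasks
a1,5400,0,2
a2,9100,1,3
a3,,0,1
"""


def load_instances(handle):
    """Parse export rows, converting numeric columns and skipping rows without a duration."""
    rows = []
    for raw in csv.DictReader(handle):
        if not raw["duration"]:
            continue  # no duration yet, so the instance is assumed to be still running
        rows.append({
            "id": raw["processInstanceId"],
            "duration_ms": int(raw["duration"]),
            "incidents": int(raw["numberOfIncidents"]),
            "user_tasks": int(raw["numberOfUserTasks"]),
        })
    return rows


instances = load_instances(io.StringIO(SAMPLE_CSV))
print(f"loaded {len(instances)} completed instances")
```

For a real file, pass an open file handle instead of the in-memory sample, for example open("export.csv", newline="").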

You can train a machine learning model using common libraries such as pandas and scikit-learn, with data in CSV or JSON format. However, CSV exports are limited to 1,000 rows of data, so you may need to adjust your filter settings before exporting. Fortunately, you can use the external Optimize endpoint to export all your data.
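When exporting through an endpoint, large result sets usually arrive in pages. The sketch below shows one way to collect them. It assumes each page is a JSON object with a data list and a searchRequestId continuation token; these field names, and the fetch_page callable that would wrap the actual HTTP request, are assumptions to verify against the Optimize REST API documentation.

```python
def collect_all_rows(fetch_page):
    """Gather rows from a paginated export.

    `fetch_page(search_request_id)` is a hypothetical callable that returns one
    page as a dict with a `data` list and a `searchRequestId` for the next call.
    """
    rows = []
    search_request_id = None
    while True:
        page = fetch_page(search_request_id)
        batch = page.get("data", [])
        if not batch:
            break  # an empty page is treated as the end of the result set
        rows.extend(batch)
        search_request_id = page.get("searchRequestId")
        if search_request_id is None:
            break  # no continuation token, so there is nothing more to request
    return rows


def make_fake_fetcher(pages):
    """Return a fetch_page stand-in that serves canned pages instead of HTTP responses."""
    remaining = iter(pages)

    def fetch_page(search_request_id):
        return next(remaining, {"data": []})

    return fetch_page


fake_pages = [
    {"searchRequestId": "s1", "data": [{"id": 1}, {"id": 2}]},
    {"searchRequestId": "s1", "data": [{"id": 3}]},
]
all_rows = collect_all_rows(make_fake_fetcher(fake_pages))
print(f"collected {len(all_rows)} rows")
```

Passing the fetch function in as a parameter keeps the paging logic testable without network access; in practice it would wrap an authenticated request to your Optimize instance.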

Start training your ML models with real process execution data

The new machine learning-ready dataset export feature in Camunda Optimize provides business analysts and data scientists with a powerful tool for analyzing their process execution data or training ML models. By providing detailed process instance data for offline model training and analysis, it helps businesses uncover patterns and trends that they can use to optimize their processes, reduce costs, and improve customer experiences.

We’d love your feedback

As always, we’d love to hear your feedback about this feature to better understand your use case and improve it further. Reach out to us in the Camunda Forum or Slack channel to let us know what you think.
