Cleaning Up Historical Data

Historical data like the Camunda Audit Trail will constantly grow in your database. Use the History Cleanup feature to limit the size of your database.

Using the History Cleanup Feature

Since version 7.7 Camunda provides a History Cleanup feature. The cleanup can be triggered manually, but is normally done automatically.

To run it automatically, you have to switch it on in the Camunda configuration as described in the documentation on periodic runs.

You also have to configure the Time-To-Live (TTL) for your process definitions, which is most often done in the BPMN model. Typical values are between three and six months, but they of course depend on your own business requirements. You find information on how to set the TTL in the documentation on History Time to Live and in the extension reference. Note that if you do not configure any TTL, the corresponding process instances will never be cleaned up by the Camunda history cleanup feature.
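
To make the effect of a TTL concrete, the following minimal sketch computes the earliest date on which a finished instance would become eligible for cleanup. It assumes that eligibility is simply the instance's end date plus the TTL, and that the TTL is given as an ISO-8601 period such as P180D. The class and method names are hypothetical and not part of the Camunda API; the engine performs its own calculation internally.

```java
import java.time.LocalDate;
import java.time.Period;

/** Hypothetical helper for illustration only; not part of the Camunda API. */
final class TtlSketch {

  /** Assumption: an instance becomes removable once endDate + TTL is reached. */
  static LocalDate earliestRemovalDate(LocalDate endDate, String isoTtl) {
    Period ttl = Period.parse(isoTtl); // e.g. "P180D" means 180 days
    return endDate.plus(ttl);
  }

  /** True if 'today' is on or after the earliest removal date. */
  static boolean isRemovable(LocalDate endDate, String isoTtl, LocalDate today) {
    return !today.isBefore(earliestRemovalDate(endDate, isoTtl));
  }
}
```

With a TTL of P180D, an instance that ended on 2024-01-01 would be removable from 2024-06-29 onwards under this assumption.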

Cleaning up Deployments Yourself

When you have multiple versions of a deployment in production, the history cleanup might eventually delete all process, case and decision instances connected to a certain deployment version. Camunda does not delete any stale deployments automatically.

If you would like to keep your process engine clean, you can easily implement a mechanism that queries and deletes deployments that are not the most recent ones and have no remaining instances.

Make sure that nothing refers to stale deployments, for example call activities that reference a specific called element version.
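
The check for pinned call activities can be sketched as follows. This is a minimal illustration on a hypothetical in-memory model: CallActivityRef and StaleReferenceCheck are invented names, not Camunda API types, and how the references are collected (for example by inspecting the calledElementBinding and calledElementVersion attributes of your deployed models) is left open.

```java
import java.util.List;

/** Hypothetical model of a call activity's binding; not a Camunda API type. */
record CallActivityRef(String calledElement, String binding, Integer version) {}

final class StaleReferenceCheck {

  /** True if any call activity pins exactly this process definition key and version. */
  static boolean isPinned(String key, int version, List<CallActivityRef> refs) {
    return refs.stream().anyMatch(ref ->
        "version".equals(ref.binding())                      // only version bindings pin a version
            && key.equals(ref.calledElement())
            && Integer.valueOf(version).equals(ref.version()));
  }
}
```

A deployment whose definitions are still pinned this way should be excluded from deletion, even if it has no remaining instances.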

Using the Java API

This code deletes old deployments that contain only process definitions which are no longer used. When you use decision or case definitions, you have to perform the corresponding checks for them as well.

List<Deployment> deployments = repositoryService.createDeploymentQuery().list();
for (Deployment deployment : deployments) {
  boolean deploymentCanBeDeleted = true;

  List<ProcessDefinition> processDefinitions = repositoryService
      .createProcessDefinitionQuery()
      .deploymentId(deployment.getId()).list(); // (1)
  for (ProcessDefinition processDefinition : processDefinitions) {
    ProcessDefinition latestProcessDefinition = repositoryService
        .createProcessDefinitionQuery()
        .processDefinitionKey(processDefinition.getKey())
        .latestVersion().singleResult();
    boolean isLatest = latestProcessDefinition.getId().equals(processDefinition.getId()); // (2)
    boolean hasRunningInstances = runtimeService
        .createProcessInstanceQuery()
        .processDefinitionId(processDefinition.getId()).count() > 0; // (3)
    boolean hasHistoricInstances = historyService
        .createHistoricProcessInstanceQuery()
        .processDefinitionId(processDefinition.getId()).count() > 0; // (4)
    if (isLatest || hasRunningInstances || hasHistoricInstances) {
      deploymentCanBeDeleted = false;
      break;
    }
  }

  if (deploymentCanBeDeleted) {
    repositoryService.deleteDeployment(deployment.getId()); // (5)
  }
}
1 Retrieve all process definitions of each deployment. Important note: If you use DMN, you have to check decision definitions as well!
2 Check that it is not the most recent version of the process definition…
3 …that there are no running instances…
4 …and that there are no ended instances in the database.
5 If this is true for all definitions in the deployment, it can be deleted.

Scheduling Cleanup Using a Cleanup Process

You can leverage Camunda mechanisms to manage and monitor the cleanup runs in further detail and deploy a cleanup process:

1 Any runtime exception occurring during cleanup runs should be reviewed by a system administrator later on.
2 It’s easy to deploy a process definition which gets executed in regular intervals:
<startEvent id="theStart">
    <timerEventDefinition>
        <timeCycle>0 0 3 ? * TUE-SAT</timeCycle> (1)
    </timerEventDefinition>
</startEvent>
1 The timer start event of this process triggers a process instance at 3 a.m. on the days following business days (Tuesday to Saturday).
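
The reasoning behind the TUE-SAT range can be verified with a small sketch: each cleanup run follows a business day (Monday to Friday), so the run days are those business days shifted by one. The class name CleanupDays is hypothetical and only serves this illustration.

```java
import java.time.DayOfWeek;
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical helper for illustration only. */
final class CleanupDays {

  /** Days that directly follow a business day (Monday to Friday). */
  static Set<DayOfWeek> daysAfterBusinessDays() {
    Set<DayOfWeek> result = EnumSet.noneOf(DayOfWeek.class);
    for (DayOfWeek day : EnumSet.range(DayOfWeek.MONDAY, DayOfWeek.FRIDAY)) {
      result.add(day.plus(1)); // the cleanup runs in the night after each business day
    }
    return result;
  }
}
```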

In this approach, you simply deploy the deployment cleanup code from the Java API section above as a Java Delegate:

public void execute(DelegateExecution delegateExecution) {
  try {
    deleteUnusedDeployments(delegateExecution);
  } catch (RuntimeException e) {
    // getStackTrace(...) is presumably a helper such as ExceptionUtils.getStackTrace from Apache Commons Lang
    throw new BpmnError("cleanupException",
        "Message: " + e.getMessage() + ", Stacktrace: " + getStackTrace(e)); // (1)
  }
}

1 Any runtime exception is translated into a BPMN error, to be handled by a system administrator who should find a task in their task list the next morning.

When using this approach, keep in mind that the cleanup must not run longer than your job execution timeout (see lockTimeInMillis in the Job Executor Configuration Properties). It must also not exceed the transaction timeouts, which depend on the concrete runtime environment you are in.
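
One way to respect such a limit is to give each run a time budget and leave the remaining candidates to the next run. The following is a minimal sketch under that assumption; TimeBoxedCleanup and its parameters are hypothetical names, and the clock is passed in only to keep the logic testable.

```java
import java.time.Duration;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

/** Hypothetical helper for illustration only; not part of the Camunda API. */
final class TimeBoxedCleanup {

  /**
   * Processes candidates until the budget is used up and returns how many
   * were handled. Remaining candidates are left for the next scheduled run.
   */
  static <T> int run(List<T> candidates, Consumer<T> action,
                     Duration budget, LongSupplier nanoClock) {
    long deadline = nanoClock.getAsLong() + budget.toNanos();
    int handled = 0;
    for (T candidate : candidates) {
      if (nanoClock.getAsLong() - deadline >= 0) {
        break; // budget exhausted; stop before the job lock expires
      }
      action.accept(candidate);
      handled++;
    }
    return handled;
  }
}
```

In a delegate, this could be called with the list of deletable deployment IDs, an action that deletes one deployment, a budget comfortably below lockTimeInMillis, and System::nanoTime as the clock. Note that the budget is only checked between items, so a single slow deletion can still overrun it.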

Using Existing Scheduling Frameworks

If you have an existing scheduling framework running (like EJB Timers, Quartz, …), you can safely use it to trigger the cleanup, as you typically already have monitoring set up for it.
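
As a rough illustration of triggering the cleanup from outside the engine, the sketch below uses the JDK's ScheduledExecutorService as a stand-in for a real scheduling framework such as Quartz or EJB Timers. NightlyCleanupScheduler is a hypothetical name, the Runnable is assumed to wrap your own cleanup code, and the 03:00 start time simply mirrors the timer example above.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration only. */
final class NightlyCleanupScheduler {

  /** Time from 'now' until the next occurrence of 'target' (today or tomorrow). */
  static Duration delayUntilNext(LocalTime target, LocalDateTime now) {
    LocalDateTime next = now.toLocalDate().atTime(target);
    if (!next.isAfter(now)) {
      next = next.plusDays(1); // target already passed today
    }
    return Duration.between(now, next);
  }

  /** Runs 'cleanup' once a day, starting at the next 03:00 local time. */
  static ScheduledExecutorService schedule(Runnable cleanup) {
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    long initialDelay = delayUntilNext(LocalTime.of(3, 0), LocalDateTime.now()).toMillis();
    scheduler.scheduleAtFixedRate(cleanup, initialDelay,
        TimeUnit.DAYS.toMillis(1), TimeUnit.MILLISECONDS);
    return scheduler;
  }
}
```

Unlike the cron expression above, this sketch runs every day and does not adjust for daylight saving time. The executor also stops rescheduling a task that throws, so the Runnable should catch and report its own exceptions.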

Bonus: Keeping a Copy of Outdated Data

Sometimes you still want to keep a copy of the data after having deleted it from Camunda. There are some approaches to achieve this:

  • BI/DWH Systems: If you export the data by means of an ETL (Extract, Transform, Load) process into an existing Data Warehouse (DWH) solution, the data will typically be kept there forever - see also Reporting About Processes.

  • Copy/Move data via SQL into another database or schema, where it will be kept.

    This is included in the SQL example of the Hamburger Sparkasse (Haspa) mentioned above.

  • Dump the database before removing data and keep the dump somewhere safe.

    This is typically done if you just need to keep the data for a potential inspection, but normally never need to use it.

  • Extract the data into a format of your choice (e.g. XML) and keep it somewhere safe.

No guarantee - The statements made in this publication are recommendations based on the practical experience of the authors. They are not part of Camunda’s official product documentation. Camunda cannot accept any responsibility for the accuracy or timeliness of the statements made. If examples of source code are shown, a total absence of errors in the provided source code cannot be guaranteed. Liability for any damage resulting from the application of the recommendations presented here, is excluded.

Copyright © Camunda Services GmbH - All rights reserved. The disclosure of the information presented here is only permitted with written consent of Camunda Services GmbH.