Perform root cause analysis

When a KPI target is not met, process managers and analysts need to understand which cases are more likely to violate the KPI. This information helps us identify the causes of KPI violations and determine the necessary corrective actions.

To help in such investigations, the Root Cause Analyzer of the KPI Center displays a list of conditions under which the risk of KPI violation differs considerably from the risk when the condition does not hold.

For example, given a KPI that states that at least 95% of cases should have a duration below 12 days, the Root Cause Analyzer generates a list of statements such as the following:

  • The KPI violation rate when “Invoice Amount” >= 50K is 7% versus 3% otherwise.

  • The KPI violation rate when “Country” is “Canada” is 2% versus 6% otherwise.

The first of these statements indicates negative deviance: the subpopulation of cases where the Invoice Amount is 50K or more needs further attention. The second statement indicates positive deviance: the cases in Canada are handled more efficiently than others, suggesting that there is something to learn from how cases are handled in Canada.

We can sort the findings based on several metrics that are common in the field of risk analysis:

  • Violation rate: Of the cases in the KPI’s population that fulfill the condition, the percentage that violate the KPI. In the second of the above examples, the KPI violation rate for the condition “Country” is “Canada” is 2%.

  • Inverse violation rate: Of the cases in the KPI’s population that do NOT fulfill the condition, the percentage that violate the KPI. In the second of the above examples, the inverse KPI violation rate for the condition “Country” is “Canada” is 6%.

  • Risk increase: The difference between the KPI violation rate and the inverse KPI violation rate. A risk increase of 5% tells us that if a case fulfills the condition, the probability of this case violating the KPI is five percentage points higher than if it does not.

  • Risk ratio: The ratio of the KPI violation rate to the inverse KPI violation rate. A risk ratio of 2 tells us that cases that fulfill the condition are twice as likely to violate the KPI as cases that do not.

  • Sub-population: The number of cases in the KPI population that fulfill the condition.

  • Sub-population percentage: The sub-population size as a percentage of the KPI’s population.

A finding is negative if the risk increase is positive. In this case, the finding identifies a condition under which the KPI violation rate increases. Conversely, a finding is positive if the risk increase is negative: the finding identifies a condition under which the KPI violation rate decreases.
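The metric definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the Root Cause Analyzer's implementation: the names `FindingMetrics` and `finding_metrics`, the case fields, and the population sizes (200 and 800 cases) are assumptions chosen so that the rates match the Invoice Amount example (7% versus 3%).

```python
from dataclasses import dataclass


@dataclass
class FindingMetrics:
    violation_rate: float          # share of condition-matching cases that violate the KPI
    inverse_violation_rate: float  # share of non-matching cases that violate the KPI
    risk_increase: float           # violation rate minus inverse violation rate
    risk_ratio: float              # violation rate divided by inverse violation rate
    subpopulation: int             # number of cases that fulfill the condition
    subpopulation_pct: float       # subpopulation as a share of the KPI population


def finding_metrics(cases, condition, violates):
    """Compute the metrics of one condition over the KPI population."""
    matching = [c for c in cases if condition(c)]
    others = [c for c in cases if not condition(c)]
    if not matching or not others:
        raise ValueError("the condition must split the population into two non-empty groups")

    rate = sum(violates(c) for c in matching) / len(matching)
    inverse = sum(violates(c) for c in others) / len(others)
    return FindingMetrics(
        violation_rate=rate,
        inverse_violation_rate=inverse,
        risk_increase=rate - inverse,
        # The ratio is undefined when no non-matching case violates the KPI;
        # this sketch reports infinity in that situation.
        risk_ratio=rate / inverse if inverse > 0 else float("inf"),
        subpopulation=len(matching),
        subpopulation_pct=len(matching) / len(cases),
    )


# Hypothetical population: 200 large invoices (14 violations), 800 others (24 violations).
cases = (
    [{"amount": 60_000, "violated": i < 14} for i in range(200)]
    + [{"amount": 10_000, "violated": i < 24} for i in range(800)]
)
m = finding_metrics(cases, lambda c: c["amount"] >= 50_000, lambda c: c["violated"])
kind = "negative" if m.risk_increase > 0 else "positive"
print(f"{m.violation_rate:.0%} vs {m.inverse_violation_rate:.0%} ({kind} finding)")
```

With these assumed numbers, the risk increase is four percentage points and the risk ratio is about 2.33, so the finding is negative.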

The Root Cause Analyzer retrieves the findings that are most “informative”. The informativeness of a finding is measured using decision tree mining techniques.
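The documentation does not say which measure the Root Cause Analyzer uses. As an illustration only, the sketch below computes information gain, a standard splitting criterion in decision tree learning: it scores how much splitting the population on a condition reduces the uncertainty about KPI violation. The function names and the inputs are assumptions.

```python
from math import log2


def entropy(p):
    """Binary entropy (in bits) of a violation probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def information_gain(n_match, viol_match, n_other, viol_other):
    """Entropy reduction obtained by splitting the population on a condition.

    n_match / viol_match: cases and violations where the condition holds.
    n_other / viol_other: cases and violations where it does not hold.
    Both groups are assumed to be non-empty.
    """
    n = n_match + n_other
    parent = entropy((viol_match + viol_other) / n)
    children = (
        (n_match / n) * entropy(viol_match / n_match)
        + (n_other / n) * entropy(viol_other / n_other)
    )
    return parent - children


# A condition that separates violating from non-violating cases perfectly
# scores higher than one that leaves the violation rate unchanged.
perfect = information_gain(50, 50, 50, 0)
useless = information_gain(50, 25, 50, 25)
```

Under this measure, a condition is more informative the more its two subpopulations differ in violation rate, weighted by how many cases each subpopulation contains.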

To investigate the possible root causes of KPI violations, go to KPIs & Metrics. Right-click the KPI and click Investigate.

In our example, we have defined a KPI, “Case duration KPI”, that flags cases with a case duration greater than 12 weeks.

This opens the KPI Root Cause Analyzer window for the selected KPI.

The KPI Root Cause Analyzer window displays a list of “findings”. Each finding is presented as a card with a textual description and the following metrics: risk ratio, risk increase, sub-population size, and sub-population percentage. The KPI violation rate and the inverse KPI violation rate are included in the description. Together, these help us understand specific contributing factors to the KPI violations.

We can sort the list of findings in ascending or descending order according to:

  • Violation rate

  • Sub-population

  • Risk ratio

To change how the findings are sorted, click Sort by.

Select the metric by which we wish to sort. For example, to sort by sub-population in ascending order, click Sub-population lowest to highest.

The findings are sorted based on our selection.
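Conceptually, sorting reorders the same list of findings by one metric. The sketch below uses hypothetical findings and field names; it is not how the tool stores its data.

```python
# Hypothetical findings; the field names and values are illustrative only.
findings = [
    {"condition": "Invoice Amount >= 50K", "violation_rate": 0.07,
     "subpopulation": 200, "risk_ratio": 7 / 3},
    {"condition": "Country is Canada", "violation_rate": 0.02,
     "subpopulation": 350, "risk_ratio": 1 / 3},
]


def sort_findings(findings, metric, descending=False):
    """Return a new list of findings ordered by the given metric."""
    return sorted(findings, key=lambda f: f[metric], reverse=descending)


by_subpopulation = sort_findings(findings, "subpopulation")            # lowest to highest
by_risk_ratio = sort_findings(findings, "risk_ratio", descending=True)  # highest to lowest
```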

Filter root cause analysis findings

When we run a root cause analysis, the output may include a large number of findings, making it difficult to identify the findings that are most relevant to our investigation. To make the output easier to navigate, we can define filters on the set of findings returned by the analysis.

For instance, using filters, we can focus on the findings that have a subpopulation of 20% or higher. Specifically, we can filter the root cause analysis findings based on the following parameters:

  • Attributes and stats

  • Violation rate

  • Inverse violation rate

  • Risk increase

  • Risk ratio

For instance, if we filter by the violation rate, we can retain or remove findings where the percentage of cases that violate the KPI exceeds 50%. For the definitions of these parameters, see Perform root cause analysis.

To demonstrate, we have created a KPI that is currently not being met. To investigate, select the KPI, right-click, and click Investigate > All attributes and stats.

This returns the findings of the root cause analyzer.

By default, the Root Cause Analyzer displays findings where the sub-population is greater than 10%, sorted from the highest risk increase to the lowest.

We can edit this filter. To do so, click the filter, change the value (for example, to 20%), and click Apply.

Now, only findings where the sub-population is greater than 20% are displayed.
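The effect of the default view and of raising the threshold can be sketched as follows. The findings, field names, and function names are hypothetical and serve only to illustrate the filtering logic described above.

```python
# Hypothetical findings; conditions and values are made up for illustration.
findings = [
    {"condition": "A", "subpopulation_pct": 0.08, "risk_increase": 0.09},
    {"condition": "B", "subpopulation_pct": 0.15, "risk_increase": 0.04},
    {"condition": "C", "subpopulation_pct": 0.35, "risk_increase": 0.02},
    {"condition": "D", "subpopulation_pct": 0.25, "risk_increase": 0.06},
]


def apply_filters(findings, minimums):
    """Keep findings whose value exceeds the minimum for every filtered metric."""
    return [f for f in findings if all(f[m] > t for m, t in minimums.items())]


def view(findings, min_subpopulation_pct=0.10):
    """Filter on sub-population percentage, then sort by highest risk increase."""
    kept = apply_filters(findings, {"subpopulation_pct": min_subpopulation_pct})
    return sorted(kept, key=lambda f: f["risk_increase"], reverse=True)


default_view = [f["condition"] for f in view(findings)]       # threshold of 10%
raised_view = [f["condition"] for f in view(findings, 0.20)]  # threshold of 20%
```

In this sketch, raising the threshold from 10% to 20% drops finding B, whose sub-population covers only 15% of the cases.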

We can also filter the sub-population based on the number of cases, rather than the percentage of cases. To filter by number of cases, click the percentage sign and click Apply.

We can filter based on any other attributes or stats. To add a filter, click Add filters.

Select the stat to filter by. For example, to filter by violation rate, select Violation rate.

Now, we can create filters based on the violation rate.

We can remove a single filter by clicking the X button on that condition.

We can also clear all applied filters by clicking Clear.

Note

Filters applied in the Root Cause Analyzer are not saved and will be reset once the Root Cause Analyzer window is closed.

Select attributes and statistics for root cause analysis

When conducting root cause analysis, it is often necessary to identify the factors that most significantly influence KPI performance. We can select the attributes or statistics we want to base those findings on.

We can base the root cause analysis findings on any combination of attributes or statistics, such as:

  • Case attributes: Any attribute in the log. Allows us to specify how the value of the attribute affects the KPI.

  • Activity frequency: How the frequency of a specific activity affects the KPI.

  • Resource frequency: How the frequency of a specific resource affects the KPI.

  • Case start date: How the time a case starts affects the KPI.

  • Case end date: How the time a case ends affects the KPI.

  • Longest activity: How the occurrence of the longest activity in the process affects the KPI.

Imagine we have defined a KPI called Case Duration KPI, which measures whether process cases are completed within four weeks. Our results indicate that the KPI target is not being met, and we aim to determine whether the Supplier or the Type of Product influences KPI performance.

To investigate these potential root causes, in the KPI and Metrics Center, select the Case Duration KPI. Right-click and click Investigate > Custom.

Select Type and Supplier and click Select to confirm.

The Root Cause Analyzer displays results segmented by the chosen attributes. We observe that when Supplier = CSL and Type = Goods, the KPI is violated 17.8% of the time, compared with 15.3% otherwise.
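A rough sketch of what segmenting by a combination of attributes means: for each combination of values, compare the violation rate inside the segment with the rate in the rest of the population. The function name, the field names, and the eight sample cases (including the supplier "ACME") are hypothetical and do not reproduce the figures above.

```python
from collections import Counter


def rates_by_combination(cases, attributes, violated_key="violated"):
    """Map each attribute-value combination to (violation rate, inverse violation rate)."""
    total = len(cases)
    total_violations = sum(c[violated_key] for c in cases)
    sizes, violations = Counter(), Counter()
    for c in cases:
        combo = tuple(c[a] for a in attributes)
        sizes[combo] += 1
        violations[combo] += c[violated_key]

    rates = {}
    for combo, n in sizes.items():
        rest = total - n
        if rest == 0:
            continue  # the combination covers every case, so there is nothing to compare
        rates[combo] = (violations[combo] / n, (total_violations - violations[combo]) / rest)
    return rates


# Hypothetical sample cases.
cases = [
    {"Supplier": "CSL", "Type": "Goods", "violated": True},
    {"Supplier": "CSL", "Type": "Goods", "violated": False},
    {"Supplier": "CSL", "Type": "Services", "violated": False},
    {"Supplier": "CSL", "Type": "Services", "violated": False},
    {"Supplier": "ACME", "Type": "Goods", "violated": True},
    {"Supplier": "ACME", "Type": "Goods", "violated": False},
    {"Supplier": "ACME", "Type": "Goods", "violated": False},
    {"Supplier": "ACME", "Type": "Goods", "violated": False},
]
rates = rates_by_combination(cases, ["Supplier", "Type"])
```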

We can also perform the analysis using statistics. For example, to see whether the start date of a case affects KPI performance, choose Case start date in the Custom Attributes selection window.

The results show that cases starting after March and before July violated the KPI 31% of the time, compared to 2% for cases that started at other times. This suggests a possible workload bottleneck or seasonal peak during that period.
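One way to look for this kind of seasonal pattern by hand is to group cases by the month in which they start and compare violation rates across months. The function name and the sample data below are hypothetical; the sketch only illustrates the idea and is not how the Root Cause Analyzer derives its date ranges.

```python
from collections import defaultdict
from datetime import date


def violation_rate_by_start_month(cases):
    """Map each start month (1-12) to the share of its cases that violate the KPI.

    `cases` is an iterable of (start_date, violated) pairs.
    """
    totals = defaultdict(int)
    violations = defaultdict(int)
    for start, violated in cases:
        totals[start.month] += 1
        violations[start.month] += violated
    return {month: violations[month] / totals[month] for month in sorted(totals)}


# Hypothetical sample cases.
cases = [
    (date(2024, 2, 10), False),
    (date(2024, 4, 3), True),
    (date(2024, 4, 20), False),
    (date(2024, 5, 15), True),
    (date(2024, 9, 1), False),
]
monthly_rates = violation_rate_by_start_month(cases)
```

Months with a noticeably higher rate than the rest are candidates for a start-date condition such as the one reported above.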
