Thresholds and alerts for WMAS metrics
WMAS generates a notification when a metric crosses its threshold. It logs each notification as a warning or an error, depending on how long the metric stays past its threshold or, for long-running work, how far the elapsed time exceeds the average.
The following table lists each metric and the rules that determine when WMAS logs a warning or an error:
| Metric | Notification rules |
|---|---|
| Throughput | If the throughput remains below the specified threshold continuously for 30 minutes or more, a warning is logged. If the throughput remains below the specified threshold continuously for 60 minutes or more, an error is logged. The default threshold is 50%, but you can configure it within a range of 0% to 100%. Work units that have a throughput value of -1 or are in an idle state do not generate notifications. These are the special throughput values reported by WMAS under specific system conditions: |
| Utilization rate | If the utilization rate remains below the specified threshold continuously for 30 minutes or more, a warning is logged. If the utilization rate remains below the specified threshold continuously for 60 minutes or more, an error is logged. The default threshold is 50%, but you can configure it within a range of 0% to 100%. |
| Backlog rate | If the backlog rate exceeds the specified threshold of system capacity continuously for 30 minutes or more, a warning is logged. If the backlog rate exceeds the specified threshold of system capacity continuously for 60 minutes or more, an error is logged. The default threshold is 200%, but you can configure it within a range of 100% to 500%. |
| Failure rate | If the failure rate remains below the specified threshold continuously for 30 minutes or more, a warning is logged. If the failure rate remains below the specified threshold continuously for 60 minutes or more, an error is logged. The default threshold is 25%, but you can configure it within a range of 0% to 100%. Work units that have a failure rate value of -1 or are in an idle state do not generate notifications. These are the special failure rate values reported by WMAS under specific system conditions: |
| Error rate | If the error rate exceeds the specified threshold continuously for 30 minutes or more, a warning is logged. If the error rate exceeds the specified threshold continuously for 60 minutes or more, an error is logged. The default threshold is 10 errors per work unit, but you can configure it within a range of 1 to 1000 errors per work unit. Work units that have an error rate value of -1 or are in an idle state do not generate notifications. These are the special error rate values reported by WMAS under specific system conditions: |
| Long running work units | A work unit is considered long running if it lasts at least 60 minutes and takes more than 25% longer than the average for similar work units. In this case, WMAS logs a warning, which escalates to an error if the elapsed time exceeds the average by more than 50%. |
| Long running activities | If an activity exceeds its defined threshold, WMAS reviews up to ten prior executions to calculate the average elapsed time. If the current elapsed time exceeds this average by 25%, a warning is logged. If the current elapsed time exceeds this average by 50%, an error is logged. Although long-running activities may occur during high-volume periods, consistent or isolated delays may indicate a systemic issue that requires further investigation. |
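
The rules in the table can be read as two small decision procedures: one based on how long a metric has stayed past its threshold, and one based on how far an elapsed time exceeds an average. The following Python sketch illustrates that reading only; it is not WMAS code. All function and constant names are hypothetical. It also assumes that the "up to ten prior executions" are the ten most recent ones, and that the activity has already exceeded its defined threshold before `activity_severity` is called.

```python
from statistics import mean
from typing import Optional, Sequence

# Durations taken from the table; the names are hypothetical.
WARNING_AFTER_MIN = 30
ERROR_AFTER_MIN = 60


def should_notify(value: float, idle: bool) -> bool:
    """Work units with a value of -1 or in an idle state are excluded."""
    return not idle and value != -1


def duration_severity(minutes_past_threshold: float) -> Optional[str]:
    """Severity for rate metrics, given continuous minutes past the threshold."""
    if minutes_past_threshold >= ERROR_AFTER_MIN:
        return "error"
    if minutes_past_threshold >= WARNING_AFTER_MIN:
        return "warning"
    return None


def overrun_severity(elapsed_min: float, average_min: float,
                     min_elapsed_min: float = 0.0) -> Optional[str]:
    """Severity when elapsed time exceeds an average by more than 25% or 50%."""
    if elapsed_min < min_elapsed_min:
        return None  # e.g. a work unit that has not yet run for 60 minutes
    if elapsed_min > average_min * 1.50:
        return "error"
    if elapsed_min > average_min * 1.25:
        return "warning"
    return None


def activity_severity(elapsed_min: float,
                      prior_elapsed_min: Sequence[float]) -> Optional[str]:
    """Compare an activity with the average of up to ten prior executions.

    Assumption: the ten most recent executions are used.
    """
    recent = list(prior_elapsed_min)[-10:]
    if not recent:
        return None  # no history to compare against
    return overrun_severity(elapsed_min, mean(recent))
```

For example, under these assumptions a work unit that has run for 80 minutes against a 60-minute average would be passed as `overrun_severity(80, 60, min_elapsed_min=60)`. It exceeds the 25% mark (75 minutes) but not the 50% mark (90 minutes), so the sketch classifies it as a warning.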