routeros-scripts/doc/check-health.md
Christian Hesse c48509683c check-health: wording: load -> utilization
2023-02-14 20:24:06 +01:00

Notify about health state

⬅️ Go back to main README

Info: This script cannot be used on its own; it requires the base installation. See main README for details.

Description

This script is run periodically from the scheduler and sends notifications on health-related events:

  • high CPU utilization
  • low available free RAM
  • voltage jumps up or down by more than the configured threshold
  • voltage drops below the hard lower limit
  • power supply failed or recovered
  • temperature is above or below threshold

Note that a bad initial state does not trigger an event.

Monitoring CPU utilization and available free RAM works on all devices. Beyond that, only sensors available in the hardware can be checked. To see what your hardware supports:

/system/health/print;

Sample notifications

CPU utilization

[Image: check-health notification, CPU utilization high]
[Image: check-health notification, CPU utilization ok]

Available free RAM

[Image: check-health notification, free RAM low]
[Image: check-health notification, free RAM ok]

Voltage

[Image: check-health notification, voltage]

Temperature

[Image: check-health notification, temperature high]
[Image: check-health notification, temperature ok]

PSU state

[Image: check-health notification, PSU fail]
[Image: check-health notification, PSU ok]

Requirements and installation

Just install the script and create a scheduler:

$ScriptInstallUpdate check-health;
/system/scheduler/add interval=1m name=check-health on-event="/system/script/run check-health;" start-time=startup;
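
To confirm the setup, it may help to list the new scheduler entry and trigger one run by hand. This is a minimal sketch that assumes the scheduler was created with the name check-health, as in the command above:

```
# List the scheduler entry created above.
/system/scheduler/print where name="check-health";
# Run the script once manually, the same way the scheduler does.
/system/script/run check-health;
```

Because a bad initial state does not trigger an event, a manual run on a device that is already unhealthy is not expected to produce a notification.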

Configuration

The configuration goes into global-config-overlay. These are the parameters:

  • CheckHealthTemperature: an array specifying temperature thresholds for sensors
  • CheckHealthVoltageLow: value (in volt*10, so 115 means 11.5 V) giving a hard lower limit
  • CheckHealthVoltagePercent: percentage change in voltage that triggers a voltage jump notification
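
As an illustration, an overlay for these parameters might look like the sketch below. All values are examples only, not recommendations. The sensor names are assumptions and have to match what /system/health/print reports on your device. The sketch also assumes that the array maps each sensor name to its threshold.

```
# Illustrative values only; adjust to your hardware.
# Temperature thresholds per sensor. The sensor names are assumptions
# and must match the names shown by /system/health/print.
:global CheckHealthTemperature {
  "temperature"=50;
  "cpu-temperature"=70;
};
# Hard lower voltage limit in volt*10: 115 means 11.5 V.
:global CheckHealthVoltageLow 115;
# Notify when the voltage jumps by more than 10 percent.
:global CheckHealthVoltagePercent 10;
```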

Notification settings for e-mail, Matrix and/or Telegram are also required.
