Automating Resource Management with CPUMon Scripts and Dashboards
Effective resource management is essential to keep servers and applications responsive, cost-efficient, and stable. CPUMon (CPU Monitor) provides lightweight, scriptable CPU monitoring that you can integrate into automation workflows and dashboards to detect, respond to, and prevent capacity issues. This article shows a practical approach: collecting CPU metrics with CPUMon, automating responses with scripts, and visualizing results in a dashboard.
What CPUMon provides
- Real-time CPU metrics: per-core and aggregate usage, load average, process-level CPU consumption.
- Low overhead: designed to run frequently without affecting system performance.
- Script-friendly output: plain text or JSON output suitable for piping into automation tools.
Architecture overview
- CPUMon runs on each host and emits periodic metrics.
- A lightweight collector (cron, systemd timer, or agent) forwards metrics to a central store (time-series DB or log aggregator).
- Automation scripts subscribe to metrics and execute actions (scale services, restart processes, throttle jobs).
- Dashboards display historical and real-time views for operators.
Deploying CPUMon at scale
- Install CPUMon on each host (package or copied binary).
- Configure output format to JSON and set a collection interval (e.g., 10sā60s).
- Use a systemd service or cronjob to ensure continuous running:
ini
[Unit] Description=CPUMon service [Service] ExecStart=/usr/local/bin/cpumon –output json –interval 15 Restart=always [Install] WantedBy=multi-user.target
Collecting and forwarding metrics
- Forward over HTTP to a collector endpoint or write to a local log file that a log shipper (Fluentd, Filebeat) forwards.
- Example curl forward from a script:
bash
/opt/cpumon/cpumon –output json –interval 15 | while read -r line; do curl -s -X POST -H “Content-Type: application/json” –data “\(line</span><span class="token" style="color: rgb(163, 21, 21);">"</span><span> https://collector.example.local/metrics </span><span></span><span class="token" style="color: rgb(0, 0, 255);">done</span><span> </span></code></div></div></pre> <h3>Automation scripts: common use cases</h3> <ol> <li>Auto-scale compute instances when average CPU > 70% for 5 minutes.</li> <li>Restart or throttle runaway processes consuming > 90% CPU for 30s.</li> <li>Re-prioritize batch jobs during peak business hours.</li> <li>Notify on-call via Slack or PagerDuty with contextual metric snippets.</li> </ol> <p>Example: scale-up script (uses a cloud CLI)</p> <pre><div class="XG2rBS5V967VhGTCEN1k"><div class="nHykNMmtaaTJMjgzStID"><div class="HsT0RHFbNELC00WicOi8"><i><svg width="16" height="16" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill="currentColor" fill-rule="evenodd" clip-rule="evenodd" d="M15.434 7.51c.137.137.212.311.212.49a.694.694 0 0 1-.212.5l-3.54 3.5a.893.893 0 0 1-.277.18 1.024 1.024 0 0 1-.684.038.945.945 0 0 1-.302-.148.787.787 0 0 1-.213-.234.652.652 0 0 1-.045-.58.74.74 0 0 1 .175-.256l3.045-3-3.045-3a.69.69 0 0 1-.22-.55.723.723 0 0 1 .303-.52 1 1 0 0 1 .648-.186.962.962 0 0 1 .614.256l3.541 3.51Zm-12.281 0A.695.695 0 0 0 2.94 8a.694.694 0 0 0 .213.5l3.54 3.5a.893.893 0 0 0 .277.18 1.024 1.024 0 0 0 .684.038.945.945 0 0 0 .302-.148.788.788 0 0 0 .213-.234.651.651 0 0 0 .045-.58.74.74 0 0 0-.175-.256L4.994 8l3.045-3a.69.69 0 0 0 .22-.55.723.723 0 0 0-.303-.52 1 1 0 0 0-.648-.186.962.962 0 0 0-.615.256l-3.54 3.51Z"></path></svg></i><p class="li3asHIMe05JPmtJCytG wZ4JdaHxSAhGy1HoNVja cPy9QU4brI7VQXFNPEvF">bash</p></div><div class="CF2lgtGWtYUYmTULoX44"><button type="button" class="st68fcLUUT0dNcuLLB2_ ffON2NH02oMAcqyoh2UU MQCbz04ET5EljRmK3YpQ CPXAhl7VTkj2dHDyAYAf" data-copycode="true" role="button" aria-label="Copy Code"><svg viewBox="0 0 16 16" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill="currentColor" fill-rule="evenodd" clip-rule="evenodd" d="M9.975 1h.09a3.2 3.2 0 0 1 3.202 3.201v1.924a.754.754 0 0 1-.017.16l1.23 1.353A2 2 0 0 1 15 8.983V14a2 2 0 0 1-2 2H8a2 2 0 0 1-1.733-1H4.183a3.201 3.201 0 0 1-3.2-3.201V4.201a3.2 3.2 0 0 1 3.04-3.197A1.25 1.25 0 0 1 5.25 0h3.5c.604 0 1.109.43 1.225 1ZM4.249 2.5h-.066a1.7 1.7 0 0 0-1.7 1.701v7.598c0 .94.761 1.701 1.7 1.701H6V7a2 2 0 0 1 2-2h3.197c.195 0 .387.028.57.083v-.882A1.7 1.7 0 0 0 10.066 2.5H9.75c-.228.304-.591.5-1 .5h-3.5c-.41 0-.772-.196-1-.5ZM5 1.75v-.5A.25.25 0 0 1 5.25 1h3.5a.25.25 0 0 1 .25.25v.5a.25.25 0 0 1-.25.25h-3.5A.25.25 0 0 1 5 1.75ZM7.5 7a.5.5 0 0 1 .5-.5h3V9a1 1 0 0 0 1 1h1.5v4a.5.5 0 0 1-.5.5H8a.5.5 0 0 1-.5-.5V7Zm6 2v-.017a.5.5 0 0 0-.13-.336L12 7.14V9h1.5Z"></path></svg>Copy Code</button><button type="button" class="st68fcLUUT0dNcuLLB2_ WtfzoAXPoZC2mMqcexgL ffON2NH02oMAcqyoh2UU MQCbz04ET5EljRmK3YpQ GnLX_jUB3Jn3idluie7R"><svg fill="none" viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg"><path fill="currentColor" fill-rule="evenodd" d="M20.618 4.214a1 1 0 0 1 .168 1.404l-11 14a1 1 0 0 1-1.554.022l-5-6a1 1 0 0 1 1.536-1.28l4.21 5.05L19.213 4.382a1 1 0 0 1 1.404-.168Z" clip-rule="evenodd"></path></svg>Copied</button></div></div><div class="mtDfw7oSa1WexjXyzs9y" style="color: var(--sds-color-text-01); font-family: var(--sds-font-family-monospace); direction: ltr; text-align: left; white-space: pre; word-spacing: normal; word-break: normal; font-size: var(--sds-font-size-label); line-height: 1.2em; tab-size: 4; hyphens: none; padding: var(--sds-space-x02, 8px) var(--sds-space-x04, 16px) var(--sds-space-x04, 16px); margin: 0px; overflow: auto; border: none; background: transparent;"><code class="language-bash" style="color: rgb(57, 58, 52); font-family: Consolas, "Bitstream Vera Sans Mono", "Courier New", Courier, monospace; direction: ltr; text-align: left; white-space: pre; word-spacing: normal; word-break: normal; font-size: 0.9em; line-height: 1.2em; tab-size: 4; hyphens: none;"><span class="token assign-left" style="color: rgb(54, 172, 170);">THRESHOLD</span><span class="token" style="color: rgb(57, 58, 52);">=</span><span class="token" style="color: rgb(54, 172, 170);">70</span><span> </span><span></span><span class="token assign-left" style="color: rgb(54, 172, 170);">WINDOW</span><span class="token" style="color: rgb(57, 58, 52);">=</span><span class="token" style="color: rgb(54, 172, 170);">300</span><span> </span><span class="token" style="color: rgb(0, 128, 0); font-style: italic;"># 5 minutes</span><span> </span><span></span><span class="token assign-left" style="color: rgb(54, 172, 170);">AVG_CPU</span><span class="token" style="color: rgb(57, 58, 52);">=</span><span class="token" style="color: rgb(54, 172, 170);">\)(query_metrics_api –metric cpu_usage –window \(WINDOW --aggregate avg</span><span class="token" style="color: rgb(54, 172, 170);">)</span><span> </span> <span></span><span class="token" style="color: rgb(0, 0, 255);">if</span><span> </span><span class="token" style="color: rgb(57, 58, 52);">[</span><span> </span><span class="token" style="color: rgb(163, 21, 21);">"</span><span class="token" style="color: rgb(54, 172, 170);">\)(echo ”\(AVG_CPU</span><span class="token" style="color: rgb(54, 172, 170);"> > </span><span class="token" style="color: rgb(54, 172, 170);">\)THRESHOLD“ | bc -l)” -eq 1 ]; then cloudctl compute scale –group web-frontend –delta 2 notify_slack “Scaled web-frontend by +2 due to high CPU: \({AVG_CPU}</span><span class="token" style="color: rgb(163, 21, 21);">%"</span><span> </span><span></span><span class="token" style="color: rgb(0, 0, 255);">fi</span><span> </span></code></div></div></pre> <p>Example: restart high-CPU process</p> <pre><div class="XG2rBS5V967VhGTCEN1k"><div class="nHykNMmtaaTJMjgzStID"><div class="HsT0RHFbNELC00WicOi8"><i><svg width="16" height="16" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill="currentColor" fill-rule="evenodd" clip-rule="evenodd" d="M15.434 7.51c.137.137.212.311.212.49a.694.694 0 0 1-.212.5l-3.54 3.5a.893.893 0 0 1-.277.18 1.024 1.024 0 0 1-.684.038.945.945 0 0 1-.302-.148.787.787 0 0 1-.213-.234.652.652 0 0 1-.045-.58.74.74 0 0 1 .175-.256l3.045-3-3.045-3a.69.69 0 0 1-.22-.55.723.723 0 0 1 .303-.52 1 1 0 0 1 .648-.186.962.962 0 0 1 .614.256l3.541 3.51Zm-12.281 0A.695.695 0 0 0 2.94 8a.694.694 0 0 0 .213.5l3.54 3.5a.893.893 0 0 0 .277.18 1.024 1.024 0 0 0 .684.038.945.945 0 0 0 .302-.148.788.788 0 0 0 .213-.234.651.651 0 0 0 .045-.58.74.74 0 0 0-.175-.256L4.994 8l3.045-3a.69.69 0 0 0 .22-.55.723.723 0 0 0-.303-.52 1 1 0 0 0-.648-.186.962.962 0 0 0-.615.256l-3.54 3.51Z"></path></svg></i><p class="li3asHIMe05JPmtJCytG wZ4JdaHxSAhGy1HoNVja cPy9QU4brI7VQXFNPEvF">bash</p></div><div class="CF2lgtGWtYUYmTULoX44"><button type="button" class="st68fcLUUT0dNcuLLB2_ ffON2NH02oMAcqyoh2UU MQCbz04ET5EljRmK3YpQ CPXAhl7VTkj2dHDyAYAf" data-copycode="true" role="button" aria-label="Copy Code"><svg viewBox="0 0 16 16" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill="currentColor" fill-rule="evenodd" clip-rule="evenodd" d="M9.975 1h.09a3.2 3.2 0 0 1 3.202 3.201v1.924a.754.754 0 0 1-.017.16l1.23 1.353A2 2 0 0 1 15 8.983V14a2 2 0 0 1-2 2H8a2 2 0 0 1-1.733-1H4.183a3.201 3.201 0 0 1-3.2-3.201V4.201a3.2 3.2 0 0 1 3.04-3.197A1.25 1.25 0 0 1 5.25 0h3.5c.604 0 1.109.43 1.225 1ZM4.249 2.5h-.066a1.7 1.7 0 0 0-1.7 1.701v7.598c0 .94.761 1.701 1.7 1.701H6V7a2 2 0 0 1 2-2h3.197c.195 0 .387.028.57.083v-.882A1.7 1.7 0 0 0 10.066 2.5H9.75c-.228.304-.591.5-1 .5h-3.5c-.41 0-.772-.196-1-.5ZM5 1.75v-.5A.25.25 0 0 1 5.25 1h3.5a.25.25 0 0 1 .25.25v.5a.25.25 0 0 1-.25.25h-3.5A.25.25 0 0 1 5 1.75ZM7.5 7a.5.5 0 0 1 .5-.5h3V9a1 1 0 0 0 1 1h1.5v4a.5.5 0 0 1-.5.5H8a.5.5 0 0 1-.5-.5V7Zm6 2v-.017a.5.5 0 0 0-.13-.336L12 7.14V9h1.5Z"></path></svg>Copy Code</button><button type="button" class="st68fcLUUT0dNcuLLB2_ WtfzoAXPoZC2mMqcexgL ffON2NH02oMAcqyoh2UU MQCbz04ET5EljRmK3YpQ GnLX_jUB3Jn3idluie7R"><svg fill="none" viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg"><path fill="currentColor" fill-rule="evenodd" d="M20.618 4.214a1 1 0 0 1 .168 1.404l-11 14a1 1 0 0 1-1.554.022l-5-6a1 1 0 0 1 1.536-1.28l4.21 5.05L19.213 4.382a1 1 0 0 1 1.404-.168Z" clip-rule="evenodd"></path></svg>Copied</button></div></div><div class="mtDfw7oSa1WexjXyzs9y" style="color: var(--sds-color-text-01); font-family: var(--sds-font-family-monospace); direction: ltr; text-align: left; white-space: pre; word-spacing: normal; word-break: normal; font-size: var(--sds-font-size-label); line-height: 1.2em; tab-size: 4; hyphens: none; padding: var(--sds-space-x02, 8px) var(--sds-space-x04, 16px) var(--sds-space-x04, 16px); margin: 0px; overflow: auto; border: none; background: transparent;"><code class="language-bash" style="color: rgb(57, 58, 52); font-family: Consolas, "Bitstream Vera Sans Mono", "Courier New", Courier, monospace; direction: ltr; text-align: left; white-space: pre; word-spacing: normal; word-break: normal; font-size: 0.9em; line-height: 1.2em; tab-size: 4; hyphens: none;"><span class="token assign-left" style="color: rgb(54, 172, 170);">PID</span><span class="token" style="color: rgb(57, 58, 52);">=</span><span class="token" style="color: rgb(54, 172, 170);">\)(ps -eo pid,pcpu,comm –sort=-pcpu | awk ‘\(2>90{print \)1; exit}’) if [ -n ”\(PID</span><span class="token" style="color: rgb(163, 21, 21);">"</span><span> </span><span class="token" style="color: rgb(57, 58, 52);">]</span><span class="token" style="color: rgb(57, 58, 52);">;</span><span> </span><span class="token" style="color: rgb(0, 0, 255);">then</span><span> </span><span> </span><span class="token assign-left" style="color: rgb(54, 172, 170);">PROC</span><span class="token" style="color: rgb(57, 58, 52);">=</span><span class="token" style="color: rgb(54, 172, 170);">\)(ps -p \(PID -o </span><span class="token assign-left" style="color: rgb(54, 172, 170);">comm</span><span class="token" style="color: rgb(57, 58, 52);">=</span><span class="token" style="color: rgb(54, 172, 170);">)</span><span> </span><span> systemctl restart </span><span class="token" style="color: rgb(163, 21, 21);">"</span><span class="token" style="color: rgb(54, 172, 170);">\)PROC“ || kill -9 ”\(PID</span><span class="token" style="color: rgb(163, 21, 21);">"</span><span> </span><span> logger </span><span class="token" style="color: rgb(163, 21, 21);">"Restarted </span><span class="token" style="color: rgb(54, 172, 170);">\)PROC (pid $PID) for excessive CPU” fi
Dashboards: what to show
- Overview: cluster average CPU
Leave a Reply