Infrastructure Metrics

Monitor the health of every server in your Devpilot workspace: CPU, memory, disk, network, and uptime across the full fleet.

The Infrastructure Metrics view in Devpilot gives you a single view of every server connected to a workspace. It answers one question: which of my servers are healthy right now, and which ones need attention?

Open it from the sidebar under Monitoring > Infrastructure, or go directly to the route /dashboard/monitoring/infrastructure.

Server fleet overview

The page lists every server registered in your workspace as a status card. For each server you see:

The server name and current health status (healthy, warning, or critical).
CPU utilization as a percentage with a coloured progress bar.
Memory utilization as a percentage with a coloured progress bar.
Disk utilization as a percentage with a coloured progress bar.
Network I/O rate (for example 12 MB/s).

The colour of each bar shifts as pressure builds, so you can scan the list and immediately spot the server that is running hot:

Utilization	Colour	Meaning
Below 60%	Green	Plenty of headroom.
60% to 79%	Amber	Getting busy, worth keeping an eye on.
80% and above	Red	Under pressure, investigate.
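
The thresholds in the table above map onto a small lookup. The following is a minimal sketch in Python; the function name bar_colour is hypothetical and is not part of any Devpilot API.

```python
def bar_colour(utilization_pct: float) -> str:
    """Map a utilization percentage to the bar colour from the table above."""
    if not 0 <= utilization_pct <= 100:
        raise ValueError("utilization must be between 0 and 100")
    if utilization_pct >= 80:
        return "red"    # under pressure, investigate
    if utilization_pct >= 60:
        return "amber"  # getting busy, worth keeping an eye on
    return "green"      # plenty of headroom
```

The sketch treats fractional values between 79% and 80% as amber, which is one reasonable reading of the "60% to 79%" band.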

The status badge on each card (healthy / warning / critical) rolls the individual resource signals into a single verdict so you can sort the fleet without reading every number.
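
This page does not spell out how the roll-up is computed. One plausible reading is a "worst signal wins" rule, sketched below. Every name here is hypothetical, and reusing the 60% and 80% colour cut-offs as the warning and critical boundaries is an assumption, not documented Devpilot behaviour.

```python
# Higher number = more severe; used to pick the worst status.
SEVERITY = {"healthy": 0, "warning": 1, "critical": 2}


def resource_status(utilization_pct: float) -> str:
    """Assumed mapping: red band -> critical, amber band -> warning."""
    if utilization_pct >= 80:
        return "critical"
    if utilization_pct >= 60:
        return "warning"
    return "healthy"


def server_status(cpu_pct: float, memory_pct: float, disk_pct: float,
                  reachable: bool = True) -> str:
    """Roll individual resource signals into one verdict (worst wins)."""
    if not reachable:
        # A server that stops responding is shown as critical.
        return "critical"
    statuses = [resource_status(p) for p in (cpu_pct, memory_pct, disk_pct)]
    return max(statuses, key=SEVERITY.__getitem__)
```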

What Devpilot collects per server

Under the hood, Devpilot's monitoring agent records time-series samples for each server that has monitoring enabled. The samples fall into four streams: CPU and memory, disk, network, and uptime. The Infrastructure dashboard aggregates the most recent samples across these streams.

For every server, Devpilot records CPU usage, CPU core count, per-core breakdown, total memory, used memory, free memory, available memory, and the 1-, 5-, and 15-minute load averages. Each sample is timestamped so you can see how load changes over time.
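
To make the shape of one such sample concrete, here is a sketch of a record holding the fields listed above. The class name, field names, and units (bytes for memory) are assumptions for illustration; Devpilot's actual schema may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Tuple


@dataclass(frozen=True)
class SystemSample:
    """One timestamped CPU and memory sample (hypothetical field names)."""
    timestamp: datetime
    cpu_usage_pct: float
    cpu_cores: int
    per_core_pct: Tuple[float, ...]
    memory_total: int      # bytes
    memory_used: int       # bytes
    memory_free: int       # bytes
    memory_available: int  # bytes
    load_1m: float
    load_5m: float
    load_15m: float

    @property
    def memory_used_pct(self) -> float:
        """Used memory as a percentage of total memory."""
        return 100.0 * self.memory_used / self.memory_total

    @property
    def load_per_core(self) -> float:
        """1-minute load average divided by core count."""
        return self.load_1m / self.cpu_cores
```

Dividing the load average by the core count is a common way to compare servers of different sizes: a load of 2.0 is light on a 16-core host and heavy on a single-core one.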

Devpilot tracks every mounted filesystem on the server: its mount point, total space, used space, available space, and usage percentage. This is what powers "disk space low" signals on heavily-used database or build servers.
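
If you want to cross-check a disk figure by hand, Python's standard library exposes the same kind of numbers. The sketch below is not how the Devpilot agent works internally; the function name and dictionary keys are hypothetical.

```python
import shutil


def filesystem_usage(mount_point: str) -> dict:
    """Return total, used, and available space for one mount point."""
    usage = shutil.disk_usage(mount_point)  # named tuple: total, used, free (bytes)
    pct = round(100.0 * usage.used / usage.total, 1) if usage.total else 0.0
    return {
        "mount_point": mount_point,
        "total_bytes": usage.total,
        "used_bytes": usage.used,
        "available_bytes": usage.free,
        "usage_pct": pct,
    }


if __name__ == "__main__":
    print(filesystem_usage("/"))
```

On Unix systems, used plus available space can be less than the total because some blocks are reserved for the root user, so percentages computed this way may differ slightly from other tools.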

For each network interface, Devpilot records bytes sent and received, packet counts, input/output errors, packet drops, and current upload and download rates. Unexpected error or drop counts are a reliable early sign of a network or NIC problem.
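
Upload and download rates are typically derived from two successive readings of cumulative counters. The sketch below shows that calculation under assumed names (InterfaceCounters, interface_rates); it illustrates the technique rather than Devpilot's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceCounters:
    """Cumulative counters for one interface at one point in time."""
    timestamp: float  # seconds
    bytes_sent: int
    bytes_recv: int
    errors: int
    drops: int


def interface_rates(prev: InterfaceCounters, curr: InterfaceCounters) -> dict:
    """Compute per-second rates and new error/drop counts between two readings."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("readings must be in chronological order")

    def delta(new: int, old: int) -> int:
        # Counters reset on reboot; clamp negative deltas to zero.
        return max(new - old, 0)

    return {
        "upload_bytes_per_s": delta(curr.bytes_sent, prev.bytes_sent) / elapsed,
        "download_bytes_per_s": delta(curr.bytes_recv, prev.bytes_recv) / elapsed,
        "new_errors": delta(curr.errors, prev.errors),
        "new_drops": delta(curr.drops, prev.drops),
    }
```

A steady stream of non-zero new_errors or new_drops between readings is the pattern worth investigating.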

Uptime records capture whether the server was reachable at each check, how long it has been online, the response time of the probe, and a short status message if the probe failed. This is the raw data behind "is the server up?".
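
A reachability check of this kind can be sketched as a timed TCP connection attempt. This page does not say which protocol Devpilot's probe uses, so the TCP approach, the names ProbeResult and tcp_probe, and the two-second timeout are all assumptions. The sketch also leaves out the "how long it has been online" field.

```python
import socket
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProbeResult:
    reachable: bool
    response_time_ms: Optional[float]  # None when the probe failed
    status_message: str


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> ProbeResult:
    """Try to open a TCP connection and time how long it takes."""
    started = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            elapsed_ms = (time.monotonic() - started) * 1000.0
            return ProbeResult(True, elapsed_ms, "ok")
    except OSError as exc:
        # Covers refused connections, timeouts, and DNS failures.
        return ProbeResult(False, None, f"probe failed: {exc}")
```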

Uptime tracking

The Infrastructure view rolls uptime probes into each server's health status. If a server stops responding, its card flips to critical and the probe failure is recorded with a status message you can read for context. To track uptime for a server and act on its status:

Connect the server. Add the server in Servers > Add Server and install the Devpilot agent. Monitoring is enabled per-server via the Monitoring enabled toggle on the server page.

Watch the status badges. On the Infrastructure page, a server that transitions from healthy to warning or critical is your cue to investigate.

Open the server detail page. From Servers, open the affected server to see richer metric history, recent deployments, and terminal or provisioning logs.

Resource utilization at a glance

Use the fleet view to answer day-to-day capacity questions without leaving the dashboard:

Which server has the least disk headroom right now?
Is memory pressure building on the database host?
Is network throughput sustained, or a one-off spike?
Are any servers sitting at 80%+ CPU for extended periods?
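
If you export the numbers, two of these questions reduce to a few lines of code. The sketch below assumes you already have the data in plain Python structures. The function names, the input shapes, and the choice of five consecutive samples as "extended" are illustrative assumptions.

```python
from typing import Mapping, Sequence


def least_disk_headroom(disk_pct_by_server: Mapping[str, float]) -> str:
    """Return the server whose disk utilization is highest."""
    return max(disk_pct_by_server, key=disk_pct_by_server.get)


def sustained_above(samples: Sequence[float], threshold: float = 80.0,
                    min_consecutive: int = 5) -> bool:
    """True if at least min_consecutive samples in a row reach the threshold."""
    run = 0
    for pct in samples:
        run = run + 1 if pct >= threshold else 0
        if run >= min_consecutive:
            return True
    return False
```

For example, least_disk_headroom({"web-1": 41.0, "db-1": 88.5}) picks db-1. A CPU series that dips below 80% partway through does not count as sustained, which separates a one-off spike from real pressure.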

Pair this with Monitoring > Alerts to get notified when a server crosses a threshold instead of having to poll the dashboard yourself.

The Infrastructure page currently surfaces the latest CPU, memory, disk, and network snapshot plus the overall status badge. Historical charts, per-interface drill-down, and per-disk breakdowns are coming soon; until then, use the raw metric streams exposed on each server's detail page.

Quota and capacity alerts

When a server crosses a capacity threshold (for example, disk usage above 90%, memory usage above 85%, or repeated uptime probe failures), Devpilot raises an alert. Alerts appear in Monitoring > Alerts with a severity, a short description, and the resource involved. You can acknowledge, silence, or resolve them there.
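
The example thresholds above can be expressed as a simple rule check. In the sketch below, the Alert fields mirror what the Alerts page shows (severity, description, resource). The severity assigned to each rule and the cut-off of three consecutive probe failures are assumptions, not documented Devpilot defaults.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Alert:
    resource: str
    severity: str
    description: str


def evaluate_alerts(disk_pct: float, memory_pct: float,
                    consecutive_probe_failures: int,
                    max_probe_failures: int = 3) -> List[Alert]:
    """Return one alert per example threshold that has been crossed."""
    alerts = []
    if disk_pct > 90:
        alerts.append(Alert("disk", "critical", f"disk usage at {disk_pct:.0f}%"))
    if memory_pct > 85:
        alerts.append(Alert("memory", "warning", f"memory usage at {memory_pct:.0f}%"))
    if consecutive_probe_failures >= max_probe_failures:
        alerts.append(Alert(
            "uptime", "critical",
            f"{consecutive_probe_failures} consecutive uptime probe failures",
        ))
    return alerts
```

The comparisons are strict, so a disk sitting at exactly 90% does not raise an alert in this sketch.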

Suggested workflow

Scan the fleet. Open the Infrastructure page and look for any server whose status badge is not healthy or whose utilization bars are amber or red.

Investigate the hot server. Click through to the server's detail page to see which resource is under pressure.

Act on it. Free disk, resize the instance, restart a runaway process, or scale out. Devpilot's terminal and script tools are available on the same server page.

Confirm recovery. Come back to Infrastructure Metrics; the status badge should return to healthy once the next metric sample comes in.

Next steps

Application Metrics

Alerts

Logs

Error Tracking
