Why Monitor Nextcloud?
Running Nextcloud in production is about more than just deploying the application and walking away. Your users depend on it for file synchronization, collaboration, and communication. When something goes wrong — a disk fills up, memory runs out, or response times spike — you need to know about it before your users start filing tickets.
Monitoring transforms Nextcloud administration from reactive firefighting into proactive management. Instead of discovering that storage ran out at 3 AM from an angry email, you get an alert when free space drops below 15%, giving you hours or days to respond. Instead of guessing why the server feels slow, you have dashboards showing exactly which resource is the bottleneck.
The combination of Prometheus and Grafana has become the industry standard for infrastructure monitoring. Prometheus collects and stores time-series metrics with a powerful query language, while Grafana transforms those metrics into actionable dashboards and alerts. Together with purpose-built Nextcloud exporters, they give you complete visibility into your Nextcloud deployment.
This guide walks you through the complete setup — from enabling the Nextcloud metrics API to building production-ready dashboards and alerting rules. Whether you are running a small team instance or an enterprise deployment serving thousands of users, the monitoring stack described here scales to match.
Monitoring Architecture Overview
Before diving into configuration, it helps to understand how the monitoring components connect and how data flows through the system.
Components
- Nextcloud — Your application, which exposes metrics through its Server Info API
- Nextcloud Exporter — A lightweight service that queries the Nextcloud API and translates metrics into Prometheus format
- Node Exporter — Collects OS-level metrics (CPU, memory, disk, network) from the host system
- Prometheus — Time-series database that scrapes metrics from exporters at regular intervals, stores them, and evaluates alerting rules
- Grafana — Visualization platform that queries Prometheus and renders dashboards, also handles alert notifications
Data Flow
┌─────────────────┐ ┌──────────────────────┐ ┌──────────────┐ ┌──────────────┐
│ │ │ │ │ │ │ │
│ Nextcloud │─────▶│ Nextcloud Exporter │◀─────│ Prometheus │─────▶│ Grafana │
│ (Server Info │ HTTP │ (port 9205) │scrape│ (port 9090) │query │ (port 3000) │
│ API) │ │ │ │ │ │ │
└─────────────────┘ └──────────────────────┘ │ │ │ │
│ │ │ Dashboards │
┌─────────────────┐ │ │ │ Alerts │
│ │ │ │ │ │
│ Node Exporter │◀───────────────────────────────────│ │ │ │
│ (port 9100) │ scrape │ │ │ │
│ │ │ │ │ │
└─────────────────┘ └──────────────┘ └──────────────┘
│
▼
┌──────────────┐
│ Alertmanager│
│ (port 9093) │
│ Email/Slack │
└──────────────┘
Prometheus operates on a pull model — it actively scrapes each exporter endpoint at configured intervals (typically every 15–60 seconds). This design means exporters do not need to know where Prometheus is; they simply expose an HTTP endpoint with current metrics. Prometheus stores these data points with timestamps, enabling both real-time monitoring and historical trend analysis.
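The exporter side of this contract is nothing more than an HTTP endpoint that serves current values as plain text. A minimal sketch in Python (the demo_users_total metric name and the ephemeral port are illustrative, not part of any real exporter):

```python
# Minimal sketch of the exporter contract: expose current metric values as
# plain text over HTTP and let Prometheus pull them on its own schedule.
import http.server
import threading
import urllib.request

def render_metrics(values):
    """Render a dict of metric name -> value in Prometheus text format."""
    return "".join(f"{name} {value}\n" for name, value in values.items())

class MetricsHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        body = render_metrics({"demo_users_total": 42}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

if __name__ == "__main__":
    # Bind to an ephemeral port and fetch our own metrics once, the way
    # Prometheus would on each scrape.
    server = http.server.HTTPServer(("127.0.0.1", 0), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    with urllib.request.urlopen(f"http://127.0.0.1:{port}/metrics") as resp:
        print(resp.read().decode())  # prints: demo_users_total 42
    server.shutdown()
```

Because the exporter only answers requests, it holds no state about who scrapes it, which is why adding a second Prometheus server later requires no exporter changes.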
Prerequisites
Before starting the monitoring setup, ensure you have the following in place:
Nextcloud Instance
- A working Nextcloud installation (version 20 or later recommended). If you are starting fresh, follow our production Nextcloud installation guide first.
- Administrative access to Nextcloud (for enabling the Server Info app and creating a monitoring user)
- HTTPS enabled with a valid SSL certificate
Server Requirements for the Monitoring Stack
- CPU: 2 vCPUs minimum (Prometheus is CPU-intensive during compaction)
- RAM: 2 GB minimum, 4 GB recommended (Prometheus keeps recent data in memory)
- Disk: 20 GB minimum for Prometheus data retention (plan ~1.5 MB per day per target with default metrics)
- OS: Ubuntu 22.04/24.04, Debian 12, or AlmaLinux 9
Deployment note: You can run the monitoring stack on the same server as Nextcloud for small deployments, but dedicated monitoring infrastructure is recommended for production. This keeps resource contention from affecting either service and ensures monitoring remains available even if the Nextcloud server has issues.
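The disk figure above can be sanity-checked with quick arithmetic. The ~1.5 MB/day/target estimate is the one given in this guide; the target count and retention below are example values, not requirements:

```python
# Back-of-envelope Prometheus disk sizing using the ~1.5 MB/day/target
# estimate from this guide. Target count and retention are illustrative.
MB_PER_DAY_PER_TARGET = 1.5
targets = 3            # e.g. prometheus itself, node_exporter, nextcloud-exporter
retention_days = 90    # matches --storage.tsdb.retention.time=90d used later

total_mb = MB_PER_DAY_PER_TARGET * targets * retention_days
print(f"Estimated TSDB size: {total_mb:.0f} MB")  # roughly 405 MB
```

Even a small multi-target setup lands well under the 20 GB minimum, which leaves headroom for additional exporters and cardinality growth.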
Network Considerations
- Port 9090 — Prometheus web UI and API
- Port 9100 — Node Exporter metrics endpoint
- Port 9205 — Nextcloud Exporter metrics endpoint
- Port 3000 — Grafana web UI
- Port 9093 — Alertmanager (if using separate alerting)
If Prometheus runs on a different server than Nextcloud, ensure the firewall allows Prometheus to reach the exporter ports on the Nextcloud host. Use a private network or VPN for exporter traffic — never expose raw metrics endpoints to the public internet.
Step 1: Enable the Nextcloud Server Info API
Nextcloud includes a built-in serverinfo app that exposes system metrics through an API endpoint. This is the data source that the Nextcloud Exporter will query.
Enable the Server Info App
The serverinfo app is typically enabled by default, but verify and enable it if needed:
# Check if serverinfo is enabled
sudo -u www-data php /var/www/nextcloud/occ app:list | grep serverinfo
# Enable if not already active
sudo -u www-data php /var/www/nextcloud/occ app:enable serverinfo
Verify the API Endpoint
Test the API endpoint to confirm it returns data:
# Test with curl using admin credentials
curl -s -u admin:YOUR_PASSWORD \
"https://your-nextcloud.example.com/ocs/v2.php/apps/serverinfo/api/v1/info?format=json" \
-H "OCS-APIREQUEST: true" | python3 -m json.tool
You should see a JSON response containing sections for system, storage, shares, server, and activeUsers.
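If you want to script a quick health check against the same endpoint, the JSON unpacks in a few lines. The sample below is heavily abbreviated and the numbers are made up; real responses carry many more fields under ocs.data:

```python
import json

# Abbreviated, illustrative serverinfo response. A real response nests
# these sections under ocs.data alongside server, shares, and more.
sample = json.loads("""
{
  "ocs": {
    "data": {
      "nextcloud": {
        "system": {"freespace": 53687091200},
        "storage": {"num_users": 42, "num_files": 128456}
      },
      "activeUsers": {"last5minutes": 7}
    }
  }
}
""")

data = sample["ocs"]["data"]
storage = data["nextcloud"]["storage"]
print("users:", storage["num_users"])                    # → users: 42
print("files:", storage["num_files"])                    # → files: 128456
print("active (5m):", data["activeUsers"]["last5minutes"])
```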
Create a Dedicated Monitoring User
Do not use your admin account for monitoring. Create a dedicated user with minimal permissions:
# Create monitoring user
sudo -u www-data php /var/www/nextcloud/occ user:add \
--display-name="Monitoring" \
--group="admin" \
monitoring
# Set a strong password when prompted
# The user needs admin group membership to access the serverinfo API
Security note: The serverinfo API requires admin-level access, so the monitoring user's permissions cannot be reduced much further. To limit credential exposure, use an app password instead of the main account password. Generate one from Nextcloud Settings > Security > Devices & sessions. For comprehensive security practices, see our Nextcloud security hardening guide.

Step 2: Install and Configure the Nextcloud Exporter
The Nextcloud Exporter is a lightweight Go application that queries the Nextcloud Server Info API and exposes the metrics in Prometheus format.
Option A: Install from Binary
# Download the latest release
wget https://github.com/xperimental/nextcloud-exporter/releases/download/v0.7.0/nextcloud-exporter-v0.7.0-linux-amd64.tar.gz
# Extract the binary
tar xzf nextcloud-exporter-v0.7.0-linux-amd64.tar.gz
# Move to system path
sudo mv nextcloud-exporter /usr/local/bin/
sudo chmod +x /usr/local/bin/nextcloud-exporter
Option B: Deploy with Docker
# Run as a Docker container
docker run -d \
--name nextcloud-exporter \
--restart unless-stopped \
-p 9205:9205 \
-e NEXTCLOUD_SERVER="https://your-nextcloud.example.com" \
-e NEXTCLOUD_USERNAME="monitoring" \
-e NEXTCLOUD_PASSWORD="your-secure-password" \
ghcr.io/xperimental/nextcloud-exporter:latest
Create a Systemd Service (Binary Install)
For the binary installation, create a systemd service file for reliable operation:
# /etc/systemd/system/nextcloud-exporter.service
[Unit]
Description=Nextcloud Exporter for Prometheus
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=nextcloud-exporter
Group=nextcloud-exporter
ExecStart=/usr/local/bin/nextcloud-exporter \
--server https://your-nextcloud.example.com \
--username monitoring \
--password your-secure-password \
--listen-address :9205 \
--timeout 30s
Restart=on-failure
RestartSec=5
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
[Install]
WantedBy=multi-user.target
# Create the service user and start the exporter
sudo useradd --system --no-create-home --shell /usr/sbin/nologin nextcloud-exporter
sudo systemctl daemon-reload
sudo systemctl enable --now nextcloud-exporter
Verify the Metrics Endpoint
# Check that the exporter is serving metrics
curl -s http://localhost:9205/metrics | head -30
# You should see lines like:
# nextcloud_system_info{version="28.0.4"} 1
# nextcloud_users_total 42
# nextcloud_files_total 128456
# nextcloud_storage_free_bytes 5.3687091e+10
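For automated verification, the exposition format is simple enough to parse by hand. A sketch that handles simple lines like the ones above (it ignores HELP/TYPE comments and does not handle label values containing spaces):

```python
# Parse simple lines of Prometheus text exposition format, as returned by
# curl http://localhost:9205/metrics. Sample values mirror the ones above.
def parse_metrics(text):
    """Return {metric_name_with_labels: float_value} for simple lines."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip HELP/TYPE comments and blank lines
        name, _, value = line.rpartition(" ")
        out[name] = float(value)
    return out

sample = """\
# HELP nextcloud_users_total Total users
nextcloud_users_total 42
nextcloud_storage_free_bytes 5.3687091e+10
"""
metrics = parse_metrics(sample)
print(metrics["nextcloud_users_total"])  # → 42.0
```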
Configuration Options Reference
| Flag / Environment Variable | Description | Default |
|---|---|---|
| --server / NEXTCLOUD_SERVER | Nextcloud instance URL | (required) |
| --username / NEXTCLOUD_USERNAME | API username | (required) |
| --password / NEXTCLOUD_PASSWORD | API password or app token | (required) |
| --listen-address | Address and port to listen on | :9205 |
| --timeout | HTTP request timeout | 5s |
| --tls-skip-verify | Skip TLS certificate verification | false |
Step 3: Prometheus Configuration
With the exporter running, configure Prometheus to scrape metrics from both the Nextcloud Exporter and the Node Exporter.
Install Prometheus
# Download Prometheus
wget https://github.com/prometheus/prometheus/releases/download/v2.51.0/prometheus-2.51.0.linux-amd64.tar.gz
tar xzf prometheus-2.51.0.linux-amd64.tar.gz
sudo mv prometheus-2.51.0.linux-amd64/prometheus /usr/local/bin/
sudo mv prometheus-2.51.0.linux-amd64/promtool /usr/local/bin/
# Create directories
sudo mkdir -p /etc/prometheus /var/lib/prometheus
sudo useradd --system --no-create-home --shell /usr/sbin/nologin prometheus
sudo chown prometheus:prometheus /var/lib/prometheus
Prometheus Configuration File
Create the main configuration at /etc/prometheus/prometheus.yml:
# /etc/prometheus/prometheus.yml
global:
scrape_interval: 30s # How often to scrape targets
evaluation_interval: 30s # How often to evaluate alerting rules
scrape_timeout: 15s # Timeout for each scrape request
external_labels:
environment: production
service: nextcloud
# Alertmanager configuration
alerting:
alertmanagers:
- static_configs:
- targets:
- localhost:9093
# Load alerting rules
rule_files:
- /etc/prometheus/rules/*.yml
# Scrape configurations
scrape_configs:
# Prometheus self-monitoring
- job_name: "prometheus"
static_configs:
- targets: ["localhost:9090"]
# Nextcloud application metrics
- job_name: "nextcloud"
scrape_interval: 60s # Nextcloud API can be slow; use longer interval
scrape_timeout: 30s
static_configs:
- targets: ["localhost:9205"]
labels:
instance: "nextcloud-prod"
datacenter: "nyc1"
# Node Exporter for OS-level metrics
- job_name: "node"
static_configs:
- targets: ["localhost:9100"]
labels:
instance: "nextcloud-prod"
datacenter: "nyc1"
Scrape Interval Recommendations
- Nextcloud Exporter (60s): The Server Info API queries the database on every call. A 60-second interval balances freshness against load. For large instances with many users, consider 120s.
- Node Exporter (30s): OS-level metrics are cheap to collect. A 30-second interval gives good resolution for resource monitoring.
- Prometheus self-monitoring (30s): Track Prometheus's own health at the default interval.
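The trade-off behind these recommendations is plain arithmetic: halving the interval doubles both the resolution and the per-target scrape load. A quick illustration:

```python
# Scrape volume per target at the intervals recommended above.
def scrapes_per_day(interval_seconds):
    return 24 * 3600 // interval_seconds

for name, interval in [("nextcloud (60s)", 60), ("node (30s)", 30)]:
    print(f"{name}: {scrapes_per_day(interval)} scrapes/day per target")
# 60s → 1440 scrapes/day, 30s → 2880 scrapes/day
```

For the Nextcloud job, each of those scrapes triggers database queries on the Nextcloud side, which is why the guide doubles the interval there rather than using the global 30s default.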
Create the Prometheus Systemd Service
# /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus Monitoring System
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=prometheus
Group=prometheus
ExecStart=/usr/local/bin/prometheus \
--config.file=/etc/prometheus/prometheus.yml \
--storage.tsdb.path=/var/lib/prometheus \
--storage.tsdb.retention.time=90d \
--web.listen-address=:9090 \
--web.enable-lifecycle
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartSec=5
NoNewPrivileges=true
[Install]
WantedBy=multi-user.target
# Start Prometheus
sudo systemctl daemon-reload
sudo systemctl enable --now prometheus
# Verify it is running
curl -s http://localhost:9090/-/healthy
# Expected: Prometheus Server is Healthy.
Validate Configuration
# Check configuration syntax before restarting
promtool check config /etc/prometheus/prometheus.yml
# Verify targets are being scraped
curl -s http://localhost:9090/api/v1/targets | python3 -m json.tool
Step 4: Key Metrics to Monitor
Understanding which metrics matter is critical to building effective dashboards and alerts. The Nextcloud Exporter exposes dozens of metrics, but these are the ones you should focus on.
System Health Metrics
| Metric Name | Description | Alert Threshold |
|---|---|---|
| nextcloud_system_cpuload | System CPU load averages (1m, 5m, 15m) | > number of CPU cores |
| nextcloud_system_mem_total_bytes | Total system memory | — |
| nextcloud_system_mem_free_bytes | Available system memory | < 10% of total |
| nextcloud_system_swap_total_bytes | Total swap space | — |
| nextcloud_system_swap_free_bytes | Available swap space | < 20% of total |
Storage Metrics
| Metric Name | Description | Alert Threshold |
|---|---|---|
| nextcloud_storage_free_bytes | Free storage space on data directory | < 10% of total or < 10 GB |
| nextcloud_storage_num_files | Total number of files managed | Trend monitoring only |
| nextcloud_storage_num_storages | Number of configured storage backends | — |
| nextcloud_storage_num_storages_local | Number of local storage mounts | — |
| nextcloud_storage_num_storages_other | Number of external storage mounts | — |
User Activity Metrics
| Metric Name | Description | Alert Threshold |
|---|---|---|
| nextcloud_users_total | Total registered users | Trend / license limits |
| nextcloud_active_users_last5min | Users active in last 5 minutes | Anomaly detection |
| nextcloud_active_users_last1hour | Users active in last hour | Capacity planning |
| nextcloud_active_users_last24hours | Users active in last 24 hours | Engagement tracking |
| nextcloud_shares_num_fed_shares_received | Federated shares received | — |
| nextcloud_shares_num_fed_shares_sent | Federated shares sent | — |
Performance Metrics
| Metric Name | Description | Alert Threshold |
|---|---|---|
| nextcloud_php_opcache_hit_rate | PHP OPcache hit rate percentage | < 95% |
| nextcloud_php_memory_limit_bytes | PHP memory limit | < 512 MB |
| nextcloud_php_max_execution_time | PHP max execution time | — |
| nextcloud_php_upload_max_size_bytes | Maximum upload file size | — |
| nextcloud_database_size_bytes | Nextcloud database size | Growth rate monitoring |
Application Metrics
| Metric Name | Description | Alert Threshold |
|---|---|---|
| nextcloud_system_info | Nextcloud version information (label) | Version change detection |
| nextcloud_apps_installed | Number of installed apps | — |
| nextcloud_apps_updates_available | Number of apps with pending updates | > 0 for extended period |
| nextcloud_up | Whether Nextcloud is reachable (1 = up, 0 = down) | == 0 |
For a deeper dive into the performance-related metrics and how to tune the settings they reflect, see our Nextcloud performance tuning guide.
Step 5: Grafana Dashboard Setup
Grafana transforms raw Prometheus metrics into visual dashboards that make it easy to assess system health at a glance.
Install Grafana
# Add the Grafana repository (Ubuntu/Debian)
sudo apt-get install -y apt-transport-https software-properties-common
sudo mkdir -p /etc/apt/keyrings/
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
# Install and start Grafana
sudo apt-get update
sudo apt-get install -y grafana
sudo systemctl enable --now grafana-server
Add Prometheus as a Data Source
- Open Grafana at http://your-server:3000 (default credentials: admin/admin)
- Navigate to Connections > Data Sources > Add data source
- Select Prometheus
- Set the URL to http://localhost:9090
- Click Save & Test to verify connectivity
You can also configure the data source via provisioning for reproducible setups:
# /etc/grafana/provisioning/datasources/prometheus.yml
apiVersion: 1
datasources:
- name: Prometheus
type: prometheus
access: proxy
url: http://localhost:9090
isDefault: true
editable: false
Community Dashboard vs Custom
The Grafana community maintains several Nextcloud dashboards you can import directly. Search for "Nextcloud" on grafana.com/grafana/dashboards and import by ID. However, community dashboards often include outdated metrics or miss important panels. We recommend starting with a community dashboard and then customizing it.
Key Dashboard Panels
Overview Row: System Status at a Glance
Create stat panels for the most critical indicators:
# Active Users (last 5 minutes) — Stat panel
nextcloud_active_users_last5min
# Nextcloud Up/Down — Stat panel with value mapping (1=Up, 0=Down)
nextcloud_up
# Free Storage Space — Stat panel
nextcloud_storage_free_bytes
# Total Files Managed — Stat panel
nextcloud_storage_num_files
# Nextcloud Version — Stat panel (use label value)
nextcloud_system_info
Performance Row: Response Times and PHP Metrics
# PHP OPcache Hit Rate — Gauge panel (target: >95%)
nextcloud_php_opcache_hit_rate
# System CPU Load (1 minute) — Time series panel
nextcloud_system_cpuload{period="1"}
# Memory Usage Percentage — Time series panel
(1 - (nextcloud_system_mem_free_bytes / nextcloud_system_mem_total_bytes)) * 100
# Database Size Over Time — Time series panel
nextcloud_database_size_bytes
Storage Row: Disk Usage Trends
# Free Disk Space Over Time — Time series panel with threshold line
nextcloud_storage_free_bytes
# Disk Usage Percentage — Gauge panel
# (Requires node_exporter metrics for total disk size)
(1 - (node_filesystem_avail_bytes{mountpoint="/var/www/nextcloud/data"} / node_filesystem_size_bytes{mountpoint="/var/www/nextcloud/data"})) * 100
# File Count Growth — Time series panel
nextcloud_storage_num_files
# Predicted Disk Full — Stat panel
# Uses linear regression to predict when disk runs out
predict_linear(node_filesystem_avail_bytes{mountpoint="/var/www/nextcloud/data"}[7d], 30 * 24 * 3600) / (1024^3)
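Under the hood, predict_linear() fits a least-squares line to the sampled (timestamp, value) pairs and extrapolates it forward. A sketch of the same calculation in Python, using synthetic samples for a disk losing 1 GB of free space per day:

```python
# What predict_linear() computes: least-squares fit over (timestamp, value)
# samples, extrapolated seconds_ahead past the last sample. Data is synthetic.
def predict_linear(samples, seconds_ahead):
    """samples: list of (unix_ts, value). Returns the extrapolated value."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    slope = num / den                       # bytes per second
    last_t = samples[-1][0]
    return mean_v + slope * (last_t + seconds_ahead - mean_t)

DAY = 24 * 3600
GB = 1024 ** 3
# 7 daily samples: free space shrinking from 100 GB by 1 GB/day.
history = [(day * DAY, (100 - day) * GB) for day in range(7)]
in_30_days = predict_linear(history, 30 * DAY)
print(f"Free space in 30 days: {in_30_days / GB:.0f} GB")  # → 64 GB
```

The PromQL version divides the result by 1024^3 the same way, so the stat panel reads directly in gigabytes.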
User Activity Row: Login Patterns and Operations
# Active Users Over Time (5min, 1hr, 24hr) — Time series panel
nextcloud_active_users_last5min
nextcloud_active_users_last1hour
nextcloud_active_users_last24hours
# Total Users — Stat panel
nextcloud_users_total
# Shares Overview — Bar gauge panel
nextcloud_shares_num_fed_shares_sent
nextcloud_shares_num_fed_shares_received
Dashboard JSON Export
Once you have built your dashboard, export it as JSON from Dashboard Settings > JSON Model. Store this JSON in version control alongside your infrastructure code. This enables you to recreate the dashboard automatically in disaster recovery scenarios. For backup strategies that include monitoring configuration, see our Nextcloud backup and disaster recovery guide.
Step 6: Alerting Configuration
Dashboards are for humans watching screens. Alerts are for when nobody is watching. A well-configured alerting system ensures critical issues get immediate attention.
Prometheus Alerting Rules
Create alerting rules at /etc/prometheus/rules/nextcloud.yml:
# /etc/prometheus/rules/nextcloud.yml
groups:
- name: nextcloud_alerts
interval: 60s
rules:
# Nextcloud instance is down
- alert: NextcloudDown
expr: nextcloud_up == 0
for: 3m
labels:
severity: critical
annotations:
summary: "Nextcloud instance is unreachable"
description: "The Nextcloud exporter has been unable to reach the instance for more than 3 minutes."
runbook_url: "https://wiki.internal/runbooks/nextcloud-down"
# Low disk space on data directory
- alert: NextcloudDiskSpaceLow
expr: nextcloud_storage_free_bytes < 10737418240
for: 5m
labels:
severity: warning
annotations:
summary: "Nextcloud storage free space below 10 GB"
description: "Free space is {{ $value | humanize1024 }}. Investigate and expand storage or clean up old files."
# Critical disk space
- alert: NextcloudDiskSpaceCritical
expr: nextcloud_storage_free_bytes < 2147483648
for: 2m
labels:
severity: critical
annotations:
summary: "Nextcloud storage free space below 2 GB"
description: "Free space is {{ $value | humanize1024 }}. Immediate action required to prevent service disruption."
# High memory usage
- alert: NextcloudHighMemoryUsage
expr: (1 - (nextcloud_system_mem_free_bytes / nextcloud_system_mem_total_bytes)) * 100 > 90
for: 10m
labels:
severity: warning
annotations:
summary: "Nextcloud server memory usage above 90%"
description: "Memory usage has been at {{ $value | printf \"%.1f\" }}% for more than 10 minutes."
# PHP OPcache hit rate low
- alert: NextcloudOPcacheHitRateLow
expr: nextcloud_php_opcache_hit_rate < 90
for: 15m
labels:
severity: warning
annotations:
summary: "PHP OPcache hit rate below 90%"
description: "OPcache hit rate is {{ $value | printf \"%.1f\" }}%. Consider increasing opcache.max_accelerated_files or opcache.memory_consumption."
# High CPU load
- alert: NextcloudHighCPULoad
expr: nextcloud_system_cpuload{period="5"} > 4
for: 15m
labels:
severity: warning
annotations:
summary: "Nextcloud server CPU load is high"
description: "5-minute CPU load average is {{ $value | printf \"%.2f\" }}. Check for runaway cron jobs or heavy user activity."
# App updates available (informational)
- alert: NextcloudAppUpdatesAvailable
expr: nextcloud_apps_updates_available > 0
for: 24h
labels:
severity: info
annotations:
summary: "{{ $value }} Nextcloud app updates are available"
description: "App updates have been pending for more than 24 hours. Review and apply updates during maintenance window."
- name: ssl_alerts
rules:
# SSL certificate expiry (requires blackbox exporter or probe)
- alert: SSLCertificateExpiringSoon
expr: probe_ssl_earliest_cert_expiry - time() < 14 * 24 * 3600
for: 1h
labels:
severity: warning
annotations:
summary: "SSL certificate expires in less than 14 days"
description: "Certificate for {{ $labels.instance }} expires in {{ $value | humanizeDuration }}."
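The raw byte thresholds in the rules above are easier to audit when derived rather than hand-typed. A two-line check confirms they are exact GiB multiples:

```python
# The literal thresholds used in the disk-space rules above, derived from GiB.
GiB = 1024 ** 3
print(10 * GiB)  # 10737418240 -- NextcloudDiskSpaceLow
print(2 * GiB)   # 2147483648  -- NextcloudDiskSpaceCritical
```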
Validate Alerting Rules
# Check rule syntax
promtool check rules /etc/prometheus/rules/nextcloud.yml
# Reload Prometheus to pick up new rules
curl -X POST http://localhost:9090/-/reload
Alertmanager Configuration
Install and configure Alertmanager to route alerts to the appropriate channels:
# /etc/alertmanager/alertmanager.yml
global:
resolve_timeout: 5m
smtp_smarthost: 'smtp.example.com:587'
smtp_from: 'alerts@example.com'
smtp_auth_username: 'alerts@example.com'
smtp_auth_password: 'smtp-password'
smtp_require_tls: true
route:
group_by: ['alertname', 'severity']
group_wait: 30s
group_interval: 5m
repeat_interval: 4h
receiver: 'default-email'
routes:
# Critical alerts go to Slack immediately
- match:
severity: critical
receiver: 'slack-critical'
group_wait: 10s
repeat_interval: 1h
# Info alerts aggregated and sent via email
- match:
severity: info
receiver: 'default-email'
group_wait: 1h
repeat_interval: 24h
receivers:
- name: 'default-email'
email_configs:
- to: 'ops-team@example.com'
send_resolved: true
- name: 'slack-critical'
slack_configs:
- api_url: 'https://hooks.slack.com/services/YOUR/WEBHOOK/URL'
channel: '#nextcloud-alerts'
title: '{{ .CommonAnnotations.summary }}'
text: '{{ .CommonAnnotations.description }}'
send_resolved: true
inhibit_rules:
- source_match:
severity: 'critical'
target_match:
severity: 'warning'
equal: ['alertname']
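The routing tree above picks a receiver by walking child routes in order: the first route whose matchers all fit the alert's labels wins, otherwise the parent's receiver applies. A simplified sketch of that selection logic (it ignores the continue flag and all grouping/timing behavior):

```python
# Simplified model of Alertmanager route selection: first matching child
# route wins, recursing into its sub-routes; otherwise use this receiver.
ROOT = {
    "receiver": "default-email",
    "routes": [
        {"match": {"severity": "critical"}, "receiver": "slack-critical"},
        {"match": {"severity": "info"}, "receiver": "default-email"},
    ],
}

def pick_receiver(labels, route=ROOT):
    for child in route.get("routes", []):
        if all(labels.get(k) == v for k, v in child["match"].items()):
            return pick_receiver(labels, child)
    return route["receiver"]

print(pick_receiver({"alertname": "NextcloudDown", "severity": "critical"}))
# → slack-critical
print(pick_receiver({"alertname": "NextcloudHighMemoryUsage",
                     "severity": "warning"}))
# → default-email (no child route matches, root receiver applies)
```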
Grafana Alert Notifications
You can also configure alert rules directly in Grafana as an alternative or supplement to Prometheus alerting rules. Grafana supports notification channels including email, Slack, Microsoft Teams, PagerDuty, OpsGenie, and generic webhooks. Configure these under Alerting > Contact points in the Grafana UI.
Step 7: Node Exporter for System Metrics
While the Nextcloud Exporter provides application-level metrics, Node Exporter gives you the complete picture of the underlying operating system. Many Nextcloud performance issues stem from system-level resource exhaustion that only Node Exporter can reveal.
Install Node Exporter
# Download and install
wget https://github.com/prometheus/node_exporter/releases/download/v1.8.1/node_exporter-1.8.1.linux-amd64.tar.gz
tar xzf node_exporter-1.8.1.linux-amd64.tar.gz
sudo mv node_exporter-1.8.1.linux-amd64/node_exporter /usr/local/bin/
# Create systemd service
sudo useradd --system --no-create-home --shell /usr/sbin/nologin node_exporter
# /etc/systemd/system/node_exporter.service
[Unit]
Description=Prometheus Node Exporter
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=node_exporter
Group=node_exporter
ExecStart=/usr/local/bin/node_exporter \
--collector.filesystem.mount-points-exclude="^/(sys|proc|dev|host|etc)($$|/)" \
--collector.netclass.ignored-devices="^(veth.*|br.*|docker.*|lo)$$" \
--web.listen-address=:9100
Restart=on-failure
RestartSec=5
NoNewPrivileges=true
[Install]
WantedBy=multi-user.target
# Enable and start
sudo systemctl daemon-reload
sudo systemctl enable --now node_exporter
Key System Metrics for Nextcloud
These Node Exporter metrics directly affect Nextcloud performance and should be included in your dashboards:
- node_cpu_seconds_total — CPU utilization per core. High iowait indicates disk bottlenecks that slow file uploads/downloads.
- node_memory_MemAvailable_bytes — Actual available memory (accounts for buffers/cache). More accurate than MemFree.
- node_filesystem_avail_bytes — Available disk space per mount point. Monitor both the Nextcloud data directory and the system root.
- node_disk_io_time_seconds_total — Disk I/O utilization. Values consistently near 100% indicate the disk is saturated.
- node_network_receive_bytes_total and node_network_transmit_bytes_total — Network throughput. Useful for detecting sync storms or DDoS patterns.
- node_filefd_allocated — Open file descriptors. Nextcloud can exhaust file descriptors under heavy concurrent access.
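Note that node_cpu_seconds_total is a counter: the useful number is its rate of change between scrapes, which is what PromQL's rate() computes. A sketch with made-up sample values:

```python
# node_cpu_seconds_total is a cumulative counter; utilization comes from its
# rate of change, as PromQL's rate() computes. Sample values are made up.
def cpu_mode_fraction(prev, curr, window_seconds):
    """Fraction of one core's time spent in a mode between two scrapes."""
    return (curr - prev) / window_seconds

# Two scrapes of node_cpu_seconds_total{cpu="0",mode="iowait"}, 30s apart:
iowait = cpu_mode_fraction(prev=1200.0, curr=1209.0, window_seconds=30)
print(f"iowait: {iowait:.0%} of cpu0")  # → 30%, a likely disk bottleneck
```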
Combining Nextcloud and System Metrics
The real power emerges when you correlate Nextcloud metrics with system metrics in the same dashboard. For example:
- A spike in nextcloud_active_users_last5min correlating with high node_cpu_seconds_total tells you the server needs more CPU for your user count.
- Growing nextcloud_database_size_bytes alongside increasing node_disk_io_time_seconds_total suggests it is time to move the database to faster storage.
- Low nextcloud_php_opcache_hit_rate combined with high node_memory_MemAvailable_bytes means you have memory to spare — increase OPcache allocation.
This correlation approach is especially important when you need to decide whether to optimize your existing deployment or migrate to a different architecture.
Advanced Monitoring
Once the core monitoring stack is in place, consider these extensions for more comprehensive observability.
Log Aggregation with Loki
Metrics tell you something is wrong; logs tell you why. Grafana Loki is a log aggregation system designed to work seamlessly with Grafana. Install Promtail on your Nextcloud server to ship logs to Loki, then correlate log events with metric anomalies in the same Grafana dashboard.
Key log files to monitor:
- /var/www/nextcloud/data/nextcloud.log — Application-level errors and warnings
- /var/log/nginx/error.log or /var/log/apache2/error.log — Web server errors
- /var/log/mysql/error.log or PostgreSQL logs — Database errors
- /var/log/php-fpm/*.log — PHP process manager logs
Uptime Monitoring and Synthetic Checks
Add the Blackbox Exporter to perform synthetic HTTP checks against your Nextcloud instance. This catches issues that internal metrics might miss, such as DNS resolution failures, certificate problems, or reverse proxy misconfigurations.
# Prometheus scrape config for Blackbox Exporter
- job_name: "blackbox-nextcloud"
metrics_path: /probe
params:
module: [http_2xx]
static_configs:
- targets:
- https://your-nextcloud.example.com/status.php
- https://your-nextcloud.example.com/login
relabel_configs:
- source_labels: [__address__]
target_label: __param_target
- source_labels: [__param_target]
target_label: instance
- target_label: __address__
replacement: localhost:9115
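The relabel_configs above can be confusing on first read; modeled as a dict transformation, each target's URL becomes a query parameter and the scrape address is rewritten to point at the exporter itself:

```python
# What the relabel_configs above do to each blackbox target, sketched as a
# plain dict transformation. The exporter address matches the config above.
def relabel(target_url, exporter_address="localhost:9115"):
    labels = {"__address__": target_url}
    labels["__param_target"] = labels["__address__"]  # source: __address__
    labels["instance"] = labels["__param_target"]     # keep the URL visible
    labels["__address__"] = exporter_address          # scrape the exporter
    return labels

result = relabel("https://your-nextcloud.example.com/status.php")
print(result["__address__"])  # → localhost:9115
print(result["instance"])     # → https://your-nextcloud.example.com/status.php
```

The net effect: Prometheus calls http://localhost:9115/probe?target=..., while the instance label still identifies which URL was probed.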
Database-Specific Monitoring
If you use MySQL/MariaDB, add the mysqld_exporter. For PostgreSQL, use the postgres_exporter. These provide granular database metrics including query performance, connection pool utilization, replication lag, and lock contention — all of which directly impact Nextcloud responsiveness.
# Key database metrics to watch:
# MySQL: mysql_global_status_slow_queries, mysql_global_status_connections
# PostgreSQL: pg_stat_activity_count, pg_stat_user_tables_seq_scan
Scaling Monitoring for Multi-Node Setups
For high-availability Nextcloud deployments with multiple application nodes, load balancers, and clustered databases:
- Run Node Exporter on every server in the cluster
- Run the Nextcloud Exporter against each application node separately
- Use Prometheus federation or Thanos for multi-site monitoring
- Add labels for node, role, and datacenter to distinguish metrics from different cluster members
- Create separate dashboard rows for per-node and cluster-aggregate views
Monitoring Nextcloud on MassiveGRID
Setting up and maintaining a monitoring stack takes time and expertise. If you are running Nextcloud on MassiveGRID's managed cloud infrastructure, much of this monitoring comes built-in.
Built-in Infrastructure Monitoring
MassiveGRID's managed hosting platform includes 24/7 infrastructure monitoring for every deployment. This covers server health, network performance, storage utilization, and hardware fault detection — all handled by the MassiveGRID operations team without any configuration on your part.
24/7 NOC Team for Alert Response
When alerts fire at 3 AM, MassiveGRID's Network Operations Center is already watching. The NOC team responds to infrastructure alerts around the clock, handling issues like disk space expansion, memory allocation adjustments, and failover orchestration. This means your on-call engineers only get paged for application-level issues that require domain-specific knowledge.
Managed Monitoring as Part of the Service
With MassiveGRID, you still have full access to deploy your own Prometheus and Grafana stack for application-level Nextcloud metrics. But the infrastructure layer — the part that requires constant vigilance and rapid response — is handled for you. This hybrid approach gives you the best of both worlds: deep application visibility with your custom dashboards, plus reliable infrastructure monitoring from a team that does it professionally.
For organizations that want to focus on their Nextcloud deployment without worrying about the underlying infrastructure monitoring, MassiveGRID provides a complete solution.
Get Fully Monitored Nextcloud Hosting
Deploy Nextcloud on MassiveGRID's managed cloud with 24/7 infrastructure monitoring, automated alerts, and NOC support included.
Explore Nextcloud Hosting
Troubleshooting Common Issues
Even with careful setup, monitoring systems occasionally need debugging. Here are the most common issues and their solutions.
Exporter Connection Refused
Symptom: Prometheus shows the Nextcloud target as DOWN with a "connection refused" error.
- Verify the exporter is running: systemctl status nextcloud-exporter
- Check the exporter is listening on the expected port: ss -tlnp | grep 9205
- Confirm the firewall allows traffic: sudo ufw status or sudo iptables -L -n
- Test connectivity from the Prometheus server: curl http://nextcloud-host:9205/metrics
Metrics Not Updating
Symptom: Grafana dashboards show stale data or flat lines.
- Check the exporter logs for authentication errors: journalctl -u nextcloud-exporter -f
- Verify the monitoring user credentials are still valid (passwords expire, app tokens get revoked)
- Confirm the Nextcloud Server Info API returns fresh data: query the API directly with curl
- Check Prometheus scrape errors: navigate to http://prometheus:9090/targets and inspect the "Last Scrape" and "Error" columns
Grafana Dashboard Showing "No Data"
Symptom: Panels display "No data" or "N/A" despite Prometheus having data.
- Verify the data source is configured correctly in Grafana (Connections > Data Sources > test)
- Check the time range selector — it might be set to a period before monitoring was configured
- Test the PromQL query directly in the Prometheus UI (http://prometheus:9090/graph) to confirm data exists
- Ensure metric names match exactly (Nextcloud Exporter metric names may change between versions)
- Check for label mismatches in queries that use label selectors like {instance="..."}
High Cardinality Warnings
Symptom: Prometheus logs warnings about high cardinality or runs out of memory.
- This typically occurs when custom exporters or applications generate metrics with unbounded label values (e.g., user IDs, file paths)
- The standard Nextcloud Exporter does not produce high-cardinality metrics, so check other exporters in your setup
- Use promtool tsdb analyze /var/lib/prometheus to identify which metrics have the most series
- Consider adding metric_relabel_configs to drop unnecessary high-cardinality labels at scrape time
Exporter Timing Out
Symptom: Prometheus marks the Nextcloud target as DOWN with timeout errors, but the exporter is running.
- The Nextcloud Server Info API can be slow on large instances. Increase the exporter timeout: --timeout 60s
- Increase the Prometheus scrape timeout for the Nextcloud job (it must be less than or equal to the scrape interval)
- Check Nextcloud server load — the API queries the database, and a heavily loaded server will respond slowly
- Consider caching the Server Info API response if you use a reverse proxy
From Reactive to Proactive: Transforming Nextcloud Administration
Setting up Prometheus and Grafana for Nextcloud monitoring is an investment that pays dividends every day your server runs. With the stack described in this guide, you gain:
- Early warning — Alerts fire when resources approach critical thresholds, not after users are impacted
- Root cause analysis — Correlated metrics pinpoint exactly which resource (CPU, memory, disk, network) is causing performance issues
- Capacity planning — Historical trends show growth patterns, helping you plan upgrades before they become emergencies
- Accountability — Dashboards provide objective SLA metrics you can share with stakeholders
- Faster incident response — When something does go wrong, dashboards immediately show what changed and when
The monitoring architecture is also extensible. As your Nextcloud deployment grows — adding more users, enabling more apps, or scaling to a multi-node architecture — the monitoring stack scales with it. Add more exporters, create more dashboards, and refine your alerting rules as you learn what matters most for your specific environment.
Start with the basics: get the Nextcloud Exporter and Node Exporter running, configure a handful of critical alerts, and build a single overview dashboard. From there, iterate based on the incidents and questions that arise. Within a few weeks, you will wonder how you ever managed Nextcloud without it.