Ceph Replication Is Not a Backup
If you're running your Ubuntu VPS on MassiveGRID's infrastructure, your data already sits on a Ceph cluster with 3x replication across NVMe drives. That means if a physical disk fails, your data survives automatically. No intervention needed, no downtime, no data loss from hardware failure.
But here's the crucial distinction that catches people off guard: replication protects against hardware failure. It does not protect against you.
If you accidentally run rm -rf /var/www, Ceph faithfully replicates that deletion across all three copies. If ransomware encrypts your database, Ceph replicates the encrypted version. If a buggy deployment corrupts your application state, that corruption is replicated too.
Replication and backups solve fundamentally different problems:
| Threat | Ceph 3x Replication | Backups |
|---|---|---|
| Disk failure | Protected | Protected |
| Node failure | Protected | Protected |
| Accidental deletion | Not protected | Protected |
| Application bugs / corruption | Not protected | Protected |
| Ransomware | Not protected | Protected (if offsite) |
| Bad deployment / config change | Not protected | Protected |
| Datacenter-level disaster | Not protected | Protected (if offsite) |
You need both. Ceph handles the infrastructure layer. Backups handle the human and software layer. This guide covers setting up the backup side.
The 3-2-1 Backup Rule
The 3-2-1 rule has survived decades of IT evolution because it works:
- 3 copies of your data (the original plus two backups)
- 2 different storage media or locations (your VPS plus an external destination)
- 1 copy offsite (physically separate from your primary infrastructure)
For a VPS setup, this translates practically to: your live data on the VPS, a local backup on the same VPS (for fast restores), and an offsite copy on S3-compatible storage or a separate server.
Ceph's three replicas technically give you three copies at the infrastructure level, but they all respond to the same logical writes. From a backup perspective, they count as one copy.
What to Back Up
Before writing scripts, define exactly what needs backing up. Missing a critical path is worse than having no backup at all because it creates false confidence.
Application files
Your web root (/var/www), application code, uploaded media, and any data stored on the filesystem. If your app writes user-generated content to disk, that directory is critical.
Databases
MySQL/MariaDB databases, PostgreSQL databases, Redis dumps if you use persistence, MongoDB data directories. Database files should never be backed up by copying raw files while the database is running. Always use dump tools.
Configuration files
These are easy to forget but painful to recreate:
- /etc/nginx/ or /etc/apache2/ — web server configs
- /etc/letsencrypt/ — SSL certificates and renewal configs
- /etc/ssh/sshd_config — SSH configuration
- /etc/systemd/system/ — custom service files
- /etc/ufw/ or /etc/iptables/ — firewall rules
- /etc/fail2ban/ — intrusion prevention configs
- Application-specific configs (.env files, docker-compose.yml, etc.)
Cron jobs
# Export all cron jobs for the current user
crontab -l > /backup/crontabs/$(whoami).cron
# Export system-wide cron
cp -r /etc/cron.d/ /backup/crontabs/system/
Package list
Capture your installed packages so you can rebuild the server from scratch if needed:
# Save installed packages
dpkg --get-selections > /backup/meta/packages.list
# Save apt sources
cp /etc/apt/sources.list /backup/meta/
cp -r /etc/apt/sources.list.d/ /backup/meta/
File Backups with rsync
rsync is the standard tool for file-level backups on Linux. It copies only files that changed since the last run (and, over a network, can transfer just the changed portions of each file), supports compression, and recovers gracefully from interrupted transfers. Here's a practical backup script:
#!/bin/bash
# /usr/local/bin/backup-files.sh
# File backup script using rsync with rotation
set -euo pipefail
# Configuration
BACKUP_BASE="/backup/files"
TIMESTAMP=$(date +%Y-%m-%d_%H%M%S)
CURRENT="${BACKUP_BASE}/current"
ARCHIVE="${BACKUP_BASE}/archive/${TIMESTAMP}"
LOG="/var/log/backup-files.log"
# Directories to back up
SOURCES=(
"/var/www"
"/etc/nginx"
"/etc/letsencrypt"
"/etc/ssh/sshd_config"
"/etc/systemd/system"
"/etc/ufw"
"/home"
)
# Exclusions
EXCLUDES=(
"--exclude=node_modules"
"--exclude=.git"
"--exclude=*.log"
"--exclude=*.tmp"
"--exclude=cache/"
"--exclude=.cache/"
)
echo "[$(date)] Starting file backup..." | tee -a "$LOG"
# Create directories (only the archive parent: the snapshot itself is
# created by cp -al below, which requires that it not already exist)
mkdir -p "$CURRENT" "${BACKUP_BASE}/archive"
# Rsync each source with hard-link rotation
for src in "${SOURCES[@]}"; do
if [ -e "$src" ]; then
dest_dir="${CURRENT}$(dirname "$src")"
mkdir -p "$dest_dir"
rsync -aAX --delete \
"${EXCLUDES[@]}" \
"$src" "$dest_dir/" 2>&1 | tee -a "$LOG"
else
echo "[$(date)] SKIP: $src does not exist" | tee -a "$LOG"
fi
done
# Create timestamped snapshot using hard links (space-efficient)
cp -al "$CURRENT" "$ARCHIVE"
echo "[$(date)] File backup completed: $ARCHIVE" | tee -a "$LOG"
The cp -al step is key here. It recreates the current mirror as a tree of hard links rather than copying data, so each timestamped "full" snapshot only uses disk space proportional to what changed since the previous run, while still appearing as a complete directory tree you can browse and restore from directly. (rsync's --link-dest flag achieves the same effect, but it must point at the previous snapshot, never at the destination itself, so with this current-plus-snapshots layout cp -al is the simpler choice.)
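To see the hard-link mechanics concretely, here's a self-contained demo in a throwaway temp directory (nothing here touches the real backup paths):

```shell
#!/bin/bash
# Demo: hard-linked snapshots share storage until a file is replaced.
set -euo pipefail
DEMO=$(mktemp -d)
mkdir "$DEMO/current"
echo "unchanged content" > "$DEMO/current/file.txt"

# Snapshot via hard links, exactly like the cp -al step in the script
cp -al "$DEMO/current" "$DEMO/snapshot"

# Both paths now share one inode: the data exists on disk only once
INODE_LIVE=$(stat -c %i "$DEMO/current/file.txt")
INODE_SNAP=$(stat -c %i "$DEMO/snapshot/file.txt")
echo "live inode: $INODE_LIVE, snapshot inode: $INODE_SNAP"

# Replacing the file (as rsync does: write to a temp name, then rename)
# gives current/ a fresh inode while the snapshot keeps the old data
echo "new content" > "$DEMO/current/.file.txt.tmp"
mv "$DEMO/current/.file.txt.tmp" "$DEMO/current/file.txt"
SNAP_CONTENT=$(cat "$DEMO/snapshot/file.txt")
echo "snapshot still reads: $SNAP_CONTENT"
```

Note that simply truncating a file in place (echo > file) would write through the hard link and change the snapshot too; the write-then-rename pattern rsync uses is what keeps snapshots immutable.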
Database Backups
MySQL / MariaDB
#!/bin/bash
# /usr/local/bin/backup-databases.sh
set -euo pipefail
BACKUP_DIR="/backup/databases"
TIMESTAMP=$(date +%Y-%m-%d_%H%M%S)
LOG="/var/log/backup-db.log"
mkdir -p "${BACKUP_DIR}/mysql" "${BACKUP_DIR}/postgres"
echo "[$(date)] Starting database backups..." | tee -a "$LOG"
# MySQL: dump each database separately
for db in $(mysql -e "SHOW DATABASES;" -sN | grep -Ev "^(information_schema|performance_schema|sys)$"); do
mysqldump \
--single-transaction \
--routines \
--triggers \
--events \
--quick \
"$db" | gzip > "${BACKUP_DIR}/mysql/${db}_${TIMESTAMP}.sql.gz"
echo "[$(date)] MySQL: $db backed up" | tee -a "$LOG"
done
The --single-transaction flag is essential. It takes a consistent snapshot using a transaction, meaning your backup is point-in-time consistent without locking tables. This works with InnoDB tables, which are the default in modern MySQL and MariaDB.
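Before relying on that guarantee, it's worth checking for tables on non-transactional engines, since MyISAM tables are locked rather than snapshotted. The information_schema query below is standard; how you authenticate (e.g. a ~/.my.cnf) depends on your setup, so this sketch only prints the query:

```shell
# List any tables NOT using InnoDB; these are not protected by
# --single-transaction and may need a maintenance-window dump instead.
ENGINE_CHECK="SELECT table_schema, table_name, engine
FROM information_schema.tables
WHERE engine <> 'InnoDB'
  AND table_schema NOT IN ('mysql','information_schema','performance_schema','sys');"
echo "$ENGINE_CHECK"
# To run it against a live server: mysql -e "$ENGINE_CHECK"
```

An empty result means every application table is transactional and the dump script above is safe as written.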
PostgreSQL
# PostgreSQL: dump each database separately. Assumes you can authenticate
# as postgres (e.g. run via sudo -u postgres, or configure ~/.pgpass);
# with Ubuntu's default peer auth, "psql -U postgres" fails when run as root.
for db in $(psql -U postgres -t -c "SELECT datname FROM pg_database WHERE datistemplate = false AND datname != 'postgres';"); do
db=$(echo "$db" | xargs) # trim whitespace
pg_dump -U postgres -Fc "$db" > "${BACKUP_DIR}/postgres/${db}_${TIMESTAMP}.dump"
echo "[$(date)] PostgreSQL: $db backed up" | tee -a "$LOG"
done
echo "[$(date)] Database backups completed" | tee -a "$LOG"
For PostgreSQL, the -Fc flag creates a custom-format dump that supports selective restore (individual tables), compression, and parallel restore. It's generally preferred over plain SQL dumps for production use.
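As a sketch of what the custom format buys you at restore time (the database name, table name, and dump path below are placeholders for illustration, not outputs of the scripts above):

```shell
# Placeholder path; substitute a real custom-format dump file
DUMP="/backup/postgres/mydb_TIMESTAMP.dump"

# Inspect the dump's contents without touching any database:
#   pg_restore -l "$DUMP"
# Restore a single table into an existing database:
#   pg_restore -U postgres -d mydb -t users "$DUMP"
# Restore the whole database with 4 parallel jobs:
#   pg_restore -U postgres -d mydb -j 4 "$DUMP"
echo "example restore target: $DUMP"
```

None of these selective or parallel options work on a plain SQL dump, which can only be replayed top to bottom through psql.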
Automating with Cron
Manual backups are worthless because they stop happening the moment you get busy. Cron makes them automatic:
# Edit crontab
crontab -e
# Add these entries:
# File backup - daily at 2:00 AM
0 2 * * * /usr/local/bin/backup-files.sh >> /var/log/backup-files.log 2>&1
# Database backup - every 6 hours
0 */6 * * * /usr/local/bin/backup-databases.sh >> /var/log/backup-db.log 2>&1
# Offsite sync - daily at 4:00 AM (after local backups complete)
0 4 * * * /usr/local/bin/backup-offsite.sh >> /var/log/backup-offsite.log 2>&1
# Backup rotation/cleanup - weekly on Sunday at 5:00 AM
0 5 * * 0 /usr/local/bin/backup-rotate.sh >> /var/log/backup-rotate.log 2>&1
Make sure your scripts are executable:
chmod +x /usr/local/bin/backup-*.sh
A common mistake is scheduling backups during peak traffic hours. If your VPS serves users primarily in one timezone, schedule backups during the quietest window. On a Dedicated VPS with guaranteed resources, the performance impact is minimal, but it's still good practice.
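Another cron pitfall is overlap: if a backup ever runs longer than its interval, the next invocation starts on top of it. A small flock guard, sketched here with an arbitrary lock path, makes the later run skip instead:

```shell
#!/bin/bash
# Sketch: skip this run if the previous one still holds the lock.
LOCK="/tmp/backup-files.lock"
exec 9>"$LOCK"          # open the lock file on fd 9
if ! flock -n 9; then   # -n: fail immediately instead of waiting
    echo "previous backup still running; skipping this run"
    exit 0
fi
echo "lock acquired"
# /usr/local/bin/backup-files.sh   # the real backup command goes here
```

The lock is tied to the open file descriptor, so it releases automatically when the script exits, even on a crash.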
Offsite Backups with rclone
rclone is the Swiss Army knife for cloud storage. It supports S3-compatible storage, Backblaze B2, Google Cloud Storage, and dozens of other backends. Here's how to set it up:
Install rclone
sudo apt update && sudo apt install -y rclone
Configure a remote
# Interactive setup
rclone config
# Or configure directly for S3-compatible storage:
mkdir -p ~/.config/rclone
cat >> ~/.config/rclone/rclone.conf << EOF
[offsite]
type = s3
provider = Other
env_auth = false
access_key_id = YOUR_ACCESS_KEY
secret_access_key = YOUR_SECRET_KEY
endpoint = https://s3.example.com
acl = private
EOF
Offsite sync script
#!/bin/bash
# /usr/local/bin/backup-offsite.sh
set -euo pipefail
LOG="/var/log/backup-offsite.log"
REMOTE="offsite:my-vps-backups"
echo "[$(date)] Starting offsite sync..." | tee -a "$LOG"
# Sync local backups to remote storage
rclone sync /backup "$REMOTE" \
--transfers 4 \
--checkers 8 \
--fast-list \
--log-file="$LOG" \
--log-level INFO \
--stats 30s
echo "[$(date)] Offsite sync completed" | tee -a "$LOG"
The rclone sync command makes the remote match the local directory exactly. Files deleted locally will also be deleted remotely. If you want to keep remote copies even after local rotation, use rclone copy instead.
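A middle ground, if you want remote deletions archived rather than mirrored: rclone's --backup-dir flag moves anything sync would delete or overwrite into a separate dated path on the same remote. A sketch that only prints the command (the bucket names follow the remote configured above):

```shell
# Files that sync would delete or replace get moved here instead
DATED="offsite:my-vps-backups-archive/$(date +%Y-%m-%d)"
CMD=(rclone sync /backup offsite:my-vps-backups
     --backup-dir "$DATED"
     --transfers 4)
echo "would run: ${CMD[*]}"
# "${CMD[@]}"   # uncomment to run for real
```

The archive path must be on the same remote as the destination but outside the synced tree, which is why it uses a separate bucket here.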
Rotation and Retention Policies
Without rotation, backups consume disk space until your VPS runs out. A practical retention policy:
- 7 daily backups — covers the past week for quick recovery from recent issues
- 4 weekly backups — covers the past month for longer-term recovery
- 3 monthly backups — covers the past quarter for compliance or historical recovery
#!/bin/bash
# /usr/local/bin/backup-rotate.sh
set -euo pipefail
BACKUP_DIR="/backup"
LOG="/var/log/backup-rotate.log"
echo "[$(date)] Starting backup rotation..." | tee -a "$LOG"
# File snapshots: enforce the 7 daily / 4 weekly / 3 monthly policy.
# Snapshot directories are named YYYY-MM-DD_HHMMSS; Sunday snapshots are
# kept for 30 days (weeklies), 1st-of-month snapshots for 90 days (monthlies).
NOW=$(date +%s)
for dir in "${BACKUP_DIR}"/files/archive/*/; do
    [ -d "$dir" ] || continue
    day=$(basename "$dir"); day=${day%%_*}
    age=$(( (NOW - $(date -d "$day" +%s)) / 86400 ))
    dow=$(date -d "$day" +%u)   # 7 = Sunday
    dom=$(date -d "$day" +%d)
    if [ "$age" -gt 90 ]; then
        rm -rf "$dir"   # beyond the monthly window
    elif [ "$age" -gt 30 ] && [ "$dom" != "01" ]; then
        rm -rf "$dir"   # weekly window: keep only 1st-of-month snapshots
    elif [ "$age" -gt 7 ] && [ "$dow" != "7" ] && [ "$dom" != "01" ]; then
        rm -rf "$dir"   # daily window: keep Sundays and 1st-of-month
    fi
done
# Database dumps: small once compressed, so keep a flat 28 days
find "${BACKUP_DIR}/databases" \( -name "*.gz" -o -name "*.dump" \) \
    -mtime +28 -exec rm -f {} + 2>/dev/null || true
# Report disk usage
echo "[$(date)] Backup disk usage:" | tee -a "$LOG"
du -sh "${BACKUP_DIR}"/* 2>/dev/null | tee -a "$LOG"
echo "[$(date)] Rotation completed" | tee -a "$LOG"
Adjust these windows based on your recovery needs. E-commerce sites or applications with financial data might need 30 daily backups and 12 monthly backups for audit compliance.
Testing Restores: The Most Neglected Step
A backup you've never tested is a backup that might not work. The most common failures:
- Permissions not preserved — rsync needs the -aAX flags to capture ACLs and extended attributes
- Database dump incomplete — large databases can time out or run out of memory during dump
- Missing dependencies — your backup includes the app but not the runtime or libraries
- Encryption keys not backed up — encrypted backups are useless without the key
Schedule a quarterly restore test. The safest approach: spin up a second VPS on MassiveGRID, restore your backups to it, and verify everything works. The cost of running a test instance for a few hours is negligible compared to discovering your backups are broken during an actual emergency.
Quick restore verification script
#!/bin/bash
# /usr/local/bin/backup-verify.sh
# Run manually to verify backup integrity
set -euo pipefail
echo "=== Backup Verification ==="
echo ""
# Check backup freshness
echo "--- Latest backups ---"
# (|| true keeps set -o pipefail from aborting the script when no backups exist yet)
LATEST_FILE=$(ls -td /backup/files/archive/*/ 2>/dev/null | head -1 || true)
LATEST_DB=$(ls -t /backup/databases/mysql/*.gz 2>/dev/null | head -1 || true)
if [ -n "$LATEST_FILE" ]; then
echo "Files: $LATEST_FILE ($(du -sh "$LATEST_FILE" | cut -f1))"
else
echo "WARNING: No file backups found!"
fi
if [ -n "$LATEST_DB" ]; then
echo "Database: $LATEST_DB ($(du -sh "$LATEST_DB" | cut -f1))"
# Verify gzip integrity
if gzip -t "$LATEST_DB" 2>/dev/null; then
echo " Integrity: OK"
else
echo " Integrity: FAILED - file is corrupt!"
fi
else
echo "WARNING: No database backups found!"
fi
# Check offsite sync status
echo ""
echo "--- Offsite status ---"
rclone size offsite:my-vps-backups 2>/dev/null || echo "WARNING: Cannot reach offsite storage"
echo ""
echo "=== Verification complete ==="
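gzip -t only validates compressed files; for everything else, a sha256 manifest catches silent corruption. A minimal sketch in a throwaway directory (the file names are illustrative):

```shell
#!/bin/bash
# Sketch: record checksums after each backup run, verify them later.
set -euo pipefail
WORK=$(mktemp -d)
echo "backup payload" > "$WORK/app-files.tar"

# After each backup run, write a manifest next to the files...
( cd "$WORK" && sha256sum app-files.tar > MANIFEST.sha256 )

# ...and at verification time, confirm nothing has changed on disk
# (exits non-zero and prints FAILED lines if any checksum mismatches)
( cd "$WORK" && sha256sum -c MANIFEST.sha256 )
```

Dropping a manifest step into backup-files.sh and a sha256sum -c into backup-verify.sh extends the same idea to the real archive directories.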
Disaster Recovery Checklist
When the worst happens, you need a clear playbook, not a panicked search through documentation. Print this, bookmark it, or save it somewhere accessible outside your VPS:
- Deploy a new VPS — provision a fresh Ubuntu 24.04 VPS from MassiveGRID (takes under 60 seconds)
- Install base packages — restore from your saved package list: dpkg --set-selections < packages.list && apt-get dselect-upgrade
- Restore configuration files — copy back /etc/nginx, SSH keys, firewall rules, SSL certificates
- Restore application files — rsync from your backup: rsync -aAX /backup/files/current/var/www/ /var/www/
- Restore databases — gunzip < database_backup.sql.gz | mysql database_name or pg_restore -d database_name backup.dump
- Restore cron jobs — crontab restored_crontab.cron
- Verify services — check that nginx, your application, and database are running
- Test functionality — hit critical endpoints, verify database connectivity, check SSL
- Update DNS — point your domain to the new VPS IP
- Re-enable backups — make sure backup scripts are running on the new server
For mission-critical applications where even an hour of downtime is unacceptable, consider MassiveGRID's Managed Dedicated Cloud Servers. The managed team handles backup orchestration, monitoring, and disaster recovery as part of the service, so you can focus on your application rather than infrastructure operations.
Putting It All Together
A complete backup strategy for your Ubuntu VPS combines multiple layers:
| Layer | Tool | Frequency | Retention |
|---|---|---|---|
| Infrastructure replication | Ceph 3x (automatic) | Real-time | Continuous |
| File backups | rsync + hard links | Daily | 7 daily, 4 weekly, 3 monthly |
| Database backups | mysqldump / pg_dump | Every 6 hours | 7 daily, 4 weekly, 3 monthly |
| Offsite copies | rclone to S3 | Daily | Matches local policy |
| Restore testing | Manual verification | Quarterly | N/A |
The scripts in this guide give you a solid foundation. Adapt the paths, schedules, and retention windows to match your specific application. The most important thing is to start: an imperfect backup that runs every day is infinitely better than a perfect backup plan that never gets implemented.
MassiveGRID Ubuntu VPS includes: Ubuntu 24.04 LTS pre-installed · Proxmox HA cluster with automatic failover · Ceph 3x replicated NVMe storage · Independent CPU/RAM/storage scaling · 12 Tbps DDoS protection · 4 global datacenter locations · 100% uptime SLA · 24/7 human support rated 9.5/10
→ Deploy a self-managed VPS — from $1.99/mo
→ Need dedicated resources? — from $8.30/mo
→ Want fully managed hosting? — we handle everything