Ceph Replication Is Not a Backup

If you're running your Ubuntu VPS on MassiveGRID's infrastructure, your data already sits on a Ceph cluster with 3x replication across NVMe drives. That means if a physical disk fails, your data survives automatically. No intervention needed, no downtime, no data loss from hardware failure.

But here's the crucial distinction that catches people off guard: replication protects against hardware failure. It does not protect against you.

If you accidentally run rm -rf /var/www, Ceph faithfully replicates that deletion across all three copies. If ransomware encrypts your database, Ceph replicates the encrypted version. If a buggy deployment corrupts your application state, that corruption is replicated too.

Replication and backups solve fundamentally different problems:

| Threat | Ceph 3x Replication | Backups |
| --- | --- | --- |
| Disk failure | Protected | Protected |
| Node failure | Protected | Protected |
| Accidental deletion | Not protected | Protected |
| Application bugs / corruption | Not protected | Protected |
| Ransomware | Not protected | Protected (if offsite) |
| Bad deployment / config change | Not protected | Protected |
| Datacenter-level disaster | Not protected | Protected (if offsite) |

You need both. Ceph handles the infrastructure layer. Backups handle the human and software layer. This guide covers setting up the backup side.

The 3-2-1 Backup Rule

The 3-2-1 rule has survived decades of IT evolution because it works: keep at least 3 copies of your data, on 2 different types of storage, with 1 copy offsite.

For a VPS setup, this translates practically to: your live data on the VPS, a local backup on the same VPS (for fast restores), and an offsite copy on S3-compatible storage or a separate server.

Ceph's three replicas technically give you three copies at the infrastructure level, but they all respond to the same logical writes. From a backup perspective, they count as one copy.

What to Back Up

Before writing scripts, define exactly what needs backing up. Missing a critical path is worse than having no backup at all because it creates false confidence.

Application files

Your web root (/var/www), application code, uploaded media, and any data stored on the filesystem. If your app writes user-generated content to disk, that directory is critical.

Databases

MySQL/MariaDB databases, PostgreSQL databases, Redis dumps if you use persistence, MongoDB data directories. Database files should never be backed up by copying raw files while the database is running. Always use dump tools.

Configuration files

These are easy to forget but painful to recreate: web server configs (/etc/nginx), TLS certificates (/etc/letsencrypt), SSH configuration (/etc/ssh/sshd_config), firewall rules (/etc/ufw), and custom systemd units (/etc/systemd/system).

Cron jobs

# Export all cron jobs for the current user
crontab -l > /backup/crontabs/$(whoami).cron

# Export system-wide cron
cp -r /etc/cron.d/ /backup/crontabs/system/

Package list

Capture your installed packages so you can rebuild the server from scratch if needed:

# Save installed packages
dpkg --get-selections > /backup/meta/packages.list

# Save apt sources
cp /etc/apt/sources.list /backup/meta/
cp -r /etc/apt/sources.list.d/ /backup/meta/
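Between disasters, the saved package list is also handy for spotting drift: comparing it against the live system shows what was installed since the last backup. A self-contained sketch using comm on two throwaway lists (on a real server you would feed it /backup/meta/packages.list and `dpkg --get-selections | sort`):

```shell
# Compare a saved package list against the current one.
# Temp files with fake contents keep the example self-contained.
old=$(mktemp)
new=$(mktemp)
printf 'nginx\tinstall\n' > "$old"
printf 'nginx\tinstall\nredis\tinstall\n' > "$new"

# comm -13 hides lines unique to the old list and lines common to both,
# leaving only packages added since the backup.
comm -13 <(sort "$old") <(sort "$new")

rm -f "$old" "$new"
```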

File Backups with rsync

rsync is the standard tool for file-level backups on Linux. It copies only what has changed since the last run, supports compression, and handles interrupted transfers gracefully. Here's a practical backup script:

#!/bin/bash
# /usr/local/bin/backup-files.sh
# File backup script using rsync with rotation

set -euo pipefail

# Configuration
BACKUP_BASE="/backup/files"
TIMESTAMP=$(date +%Y-%m-%d_%H%M%S)
CURRENT="${BACKUP_BASE}/current"
ARCHIVE="${BACKUP_BASE}/archive/${TIMESTAMP}"
LOG="/var/log/backup-files.log"

# Directories to back up
SOURCES=(
    "/var/www"
    "/etc/nginx"
    "/etc/letsencrypt"
    "/etc/ssh/sshd_config"
    "/etc/systemd/system"
    "/etc/ufw"
    "/home"
)

# Exclusions
EXCLUDES=(
    "--exclude=node_modules"
    "--exclude=.git"
    "--exclude=*.log"
    "--exclude=*.tmp"
    "--exclude=cache/"
    "--exclude=.cache/"
)

echo "[$(date)] Starting file backup..." | tee -a "$LOG"

# Create the "current" tree and the archive parent directory
mkdir -p "$CURRENT" "${BACKUP_BASE}/archive"

# Rsync each source into the current tree
for src in "${SOURCES[@]}"; do
    if [ -e "$src" ]; then
        dest_dir="${CURRENT}$(dirname "$src")"
        mkdir -p "$dest_dir"
        rsync -aAX --delete \
            "${EXCLUDES[@]}" \
            "$src" "$dest_dir/" 2>&1 | tee -a "$LOG"
    else
        echo "[$(date)] SKIP: $src does not exist" | tee -a "$LOG"
    fi
done

# Create timestamped snapshot using hard links (space-efficient)
cp -al "$CURRENT" "$ARCHIVE"

echo "[$(date)] File backup completed: $ARCHIVE" | tee -a "$LOG"

The cp -al at the end is the key trick. Instead of copying file contents, it builds a tree of hard links pointing at the same data as current, so each snapshot costs almost no disk space up front. When the next run updates a changed file, rsync writes a new file and renames it into place, which breaks the link: the snapshot keeps the old version while current gets the new one. Each timestamped directory therefore appears as a complete backup you can browse and restore from directly, while unchanged files are stored only once.
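The hard-link mechanics are easy to see with plain coreutils. A self-contained sketch in a throwaway directory:

```shell
# Hard-link snapshots: unchanged files share a single inode, so they cost no space.
work=$(mktemp -d)
mkdir "$work/current"
echo "v1" > "$work/current/app.conf"

cp -al "$work/current" "$work/snap1"   # snapshot: links, not copies
stat -c %h "$work/current/app.conf"    # -> 2 (two names, one inode)

# rsync updates a changed file by writing a new copy and renaming it into
# place, which breaks the link: the snapshot keeps v1, current gets v2.
echo "v2" > "$work/current/app.conf.tmp"
mv "$work/current/app.conf.tmp" "$work/current/app.conf"

cat "$work/snap1/app.conf"             # -> v1
cat "$work/current/app.conf"           # -> v2
rm -rf "$work"
```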

Database Backups

MySQL / MariaDB

#!/bin/bash
# /usr/local/bin/backup-databases.sh

set -euo pipefail

BACKUP_DIR="/backup/databases"
TIMESTAMP=$(date +%Y-%m-%d_%H%M%S)
LOG="/var/log/backup-db.log"

mkdir -p "${BACKUP_DIR}/mysql" "${BACKUP_DIR}/postgres"

echo "[$(date)] Starting database backups..." | tee -a "$LOG"

# MySQL: dump each database separately
for db in $(mysql -e "SHOW DATABASES;" -sN | grep -Ev "^(information_schema|performance_schema|sys)$"); do
    mysqldump \
        --single-transaction \
        --routines \
        --triggers \
        --events \
        --quick \
        "$db" | gzip > "${BACKUP_DIR}/mysql/${db}_${TIMESTAMP}.sql.gz"
    echo "[$(date)] MySQL: $db backed up" | tee -a "$LOG"
done

The --single-transaction flag is essential. It takes a consistent snapshot using a transaction, meaning your backup is point-in-time consistent without locking tables. This works with InnoDB tables, which are the default in modern MySQL and MariaDB.
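The loop above assumes the invoking user can authenticate without a password (on Ubuntu, root typically can via socket auth). If you run backups as a dedicated database user instead, put the credentials in an option file rather than in the script. A sketch with placeholder values:

```ini
# ~/.my.cnf (chmod 600): read automatically by mysql and mysqldump
[client]
user = backup
password = YOUR_PASSWORD_HERE
```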

PostgreSQL

# PostgreSQL: dump each database separately
for db in $(psql -U postgres -t -c "SELECT datname FROM pg_database WHERE datistemplate = false AND datname != 'postgres';"); do
    db=$(echo "$db" | xargs)  # trim whitespace
    pg_dump -U postgres -Fc "$db" > "${BACKUP_DIR}/postgres/${db}_${TIMESTAMP}.dump"
    echo "[$(date)] PostgreSQL: $db backed up" | tee -a "$LOG"
done

echo "[$(date)] Database backups completed" | tee -a "$LOG"

For PostgreSQL, the -Fc flag creates a custom-format dump that supports selective restore (individual tables), compression, and parallel restore. It's generally preferred over plain SQL dumps for production use.
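As with MySQL, the psql and pg_dump calls above assume passwordless access, for example running as the postgres system user under peer authentication. If a password is required, PostgreSQL tools read it from a .pgpass file. A sketch with placeholder values:

```ini
# ~/.pgpass (chmod 600), one entry per line: hostname:port:database:username:password
localhost:5432:*:postgres:YOUR_PASSWORD_HERE
```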

Automating with Cron

Manual backups are worthless because they stop happening the moment you get busy. Cron makes them automatic:

# Edit crontab
crontab -e

# Add these entries:

# File backup - daily at 2:00 AM
0 2 * * * /usr/local/bin/backup-files.sh >> /var/log/backup-files.log 2>&1

# Database backup - every 6 hours
0 */6 * * * /usr/local/bin/backup-databases.sh >> /var/log/backup-db.log 2>&1

# Offsite sync - daily at 4:00 AM (after local backups complete)
0 4 * * * /usr/local/bin/backup-offsite.sh >> /var/log/backup-offsite.log 2>&1

# Backup rotation/cleanup - weekly on Sunday at 5:00 AM
0 5 * * 0 /usr/local/bin/backup-rotate.sh >> /var/log/backup-rotate.log 2>&1

Make sure your scripts are executable:

chmod +x /usr/local/bin/backup-*.sh

A common mistake is scheduling backups during peak traffic hours. If your VPS serves users primarily in one timezone, schedule backups during the quietest window. On a Dedicated VPS with guaranteed resources, the performance impact is minimal, but it's still good practice.
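Timing aside, cron will happily start a new run while a slow one is still going, and two rsyncs writing the same tree is a mess. A minimal guard you could add near the top of each script (the lock path is illustrative); mkdir is atomic, so it doubles as a portable lock:

```shell
# Skip this run if a previous one is still holding the lock directory.
LOCKDIR="${TMPDIR:-/tmp}/backup-files.lock"
if ! mkdir "$LOCKDIR" 2>/dev/null; then
    echo "previous backup still running, skipping this run"
    exit 0
fi
# Release the lock when the script exits, however it exits.
trap 'rmdir "$LOCKDIR"' EXIT

# ... backup work goes here ...
```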

Offsite Backups with rclone

rclone is the Swiss Army knife for cloud storage. It supports S3-compatible storage, Backblaze B2, Google Cloud Storage, and dozens of other backends. Here's how to set it up:

Install rclone

sudo apt update && sudo apt install -y rclone

Configure a remote

# Interactive setup
rclone config

# Or configure directly for S3-compatible storage:
mkdir -p ~/.config/rclone
cat >> ~/.config/rclone/rclone.conf << EOF
[offsite]
type = s3
provider = Other
env_auth = false
access_key_id = YOUR_ACCESS_KEY
secret_access_key = YOUR_SECRET_KEY
endpoint = https://s3.example.com
acl = private
EOF

Offsite sync script

#!/bin/bash
# /usr/local/bin/backup-offsite.sh

set -euo pipefail

LOG="/var/log/backup-offsite.log"
REMOTE="offsite:my-vps-backups"

echo "[$(date)] Starting offsite sync..." | tee -a "$LOG"

# Sync local backups to remote storage
rclone sync /backup "$REMOTE" \
    --transfers 4 \
    --checkers 8 \
    --fast-list \
    --log-file="$LOG" \
    --log-level INFO \
    --stats 30s

echo "[$(date)] Offsite sync completed" | tee -a "$LOG"

The rclone sync command makes the remote match the local directory exactly. Files deleted locally will also be deleted remotely. If you want to keep remote copies even after local rotation, use rclone copy instead.

Rotation and Retention Policies

Without rotation, backups consume disk space until your VPS runs out. A practical retention policy:

#!/bin/bash
# /usr/local/bin/backup-rotate.sh
# Retention tiers: 7 daily, then weekly (Sundays) to 30 days,
# then monthly (the 1st) to 90 days

set -euo pipefail

BACKUP_DIR="/backup"
LOG="/var/log/backup-rotate.log"
NOW=$(date +%s)

echo "[$(date)] Starting backup rotation..." | tee -a "$LOG"

# Decide whether a backup taken on a given day (YYYY-MM-DD) should be kept
keep_backup() {
    local day="$1" ts age dow dom
    ts=$(date -d "$day" +%s 2>/dev/null) || return 0   # keep anything we can't parse
    age=$(( (NOW - ts) / 86400 ))
    dow=$(date -d "$day" +%u)   # 7 = Sunday
    dom=$(date -d "$day" +%d)   # 01 = first of the month
    (( age <= 7 )) && return 0                                                 # daily tier
    (( age <= 30 )) && { [ "$dow" = "7" ] || [ "$dom" = "01" ]; } && return 0  # weekly tier
    (( age <= 90 )) && [ "$dom" = "01" ] && return 0                           # monthly tier
    return 1
}

# File archives: directory names look like YYYY-MM-DD_HHMMSS
for dir in "${BACKUP_DIR}/files/archive"/*/; do
    [ -d "$dir" ] || continue
    day=$(basename "$dir"); day=${day%%_*}
    keep_backup "$day" || rm -rf "$dir"
done

# Database dumps: filenames embed the same YYYY-MM-DD timestamp
find "${BACKUP_DIR}/databases" -type f \( -name "*.sql.gz" -o -name "*.dump" \) |
while read -r f; do
    day=$(basename "$f" | grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2}' | head -1) || true
    [ -n "$day" ] || continue
    keep_backup "$day" || rm -f "$f"
done

# Report disk usage
echo "[$(date)] Backup disk usage:" | tee -a "$LOG"
du -sh "${BACKUP_DIR}"/* 2>/dev/null | tee -a "$LOG"

echo "[$(date)] Rotation completed" | tee -a "$LOG"

Adjust these windows based on your recovery needs. E-commerce sites or applications with financial data might need 30 daily backups and 12 monthly backups for audit compliance.

Testing Restores: The Most Neglected Step

A backup you've never tested is a backup that might not work. The most common failures: dumps that have been silently failing for months, a critical directory missing from the backup set, archives that turn out corrupt or truncated, and restore procedures nobody has actually walked through end to end.

Schedule a quarterly restore test. The safest approach: spin up a second VPS on MassiveGRID, restore your backups to it, and verify everything works. The cost of running a test instance for a few hours is negligible compared to discovering your backups are broken during an actual emergency.
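Between full restore tests, there is a cheap smoke test for MySQL dumps: mysqldump writes a "-- Dump completed" trailer as the last line of a successful dump, so a file cut off mid-write won't have one. A sketch of that check (the function name and the path in the usage comment are ours):

```shell
# Fail if a gzipped SQL dump is missing mysqldump's completion trailer.
check_dump() {
    if gunzip -c "$1" | tail -n 1 | grep -q '^-- Dump completed'; then
        echo "OK: $1"
    else
        echo "TRUNCATED: $1"
        return 1
    fi
}

# Usage (path is illustrative):
# check_dump /backup/databases/mysql/shop_2025-01-01_020000.sql.gz
```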

Quick restore verification script

#!/bin/bash
# /usr/local/bin/backup-verify.sh
# Run manually to verify backup integrity

set -euo pipefail

echo "=== Backup Verification ==="
echo ""

# Check backup freshness
echo "--- Latest backups ---"
LATEST_FILE=$(ls -td /backup/files/archive/*/ 2>/dev/null | head -1 || true)
LATEST_DB=$(ls -t /backup/databases/mysql/*.gz 2>/dev/null | head -1 || true)

if [ -n "$LATEST_FILE" ]; then
    echo "Files: $LATEST_FILE ($(du -sh "$LATEST_FILE" | cut -f1))"
else
    echo "WARNING: No file backups found!"
fi

if [ -n "$LATEST_DB" ]; then
    echo "Database: $LATEST_DB ($(du -sh "$LATEST_DB" | cut -f1))"
    # Verify gzip integrity
    if gzip -t "$LATEST_DB" 2>/dev/null; then
        echo "  Integrity: OK"
    else
        echo "  Integrity: FAILED - file is corrupt!"
    fi
else
    echo "WARNING: No database backups found!"
fi

# Check offsite sync status
echo ""
echo "--- Offsite status ---"
rclone size offsite:my-vps-backups 2>/dev/null || echo "WARNING: Cannot reach offsite storage"

echo ""
echo "=== Verification complete ==="

Disaster Recovery Checklist

When the worst happens, you need a clear playbook, not a panicked search through documentation. Print this, bookmark it, or save it somewhere accessible outside your VPS:

  1. Deploy a new VPS — provision a fresh Ubuntu 24.04 VPS from MassiveGRID (takes under 60 seconds)
  2. Install base packages — restore from your saved package list: dpkg --set-selections < packages.list && apt-get dselect-upgrade
  3. Restore configuration files — copy back /etc/nginx, SSH keys, firewall rules, SSL certificates
  4. Restore application files — rsync from your backup: rsync -aAX /backup/files/current/var/www/ /var/www/
  5. Restore databases — gunzip < database_backup.sql.gz | mysql database_name or pg_restore -d database_name backup.dump
  6. Restore cron jobs — crontab restored_crontab.cron
  7. Verify services — check that nginx, your application, and database are running
  8. Test functionality — hit critical endpoints, verify database connectivity, check SSL
  9. Update DNS — point your domain to the new VPS IP
  10. Re-enable backups — make sure backup scripts are running on the new server

For mission-critical applications where even an hour of downtime is unacceptable, consider MassiveGRID's Managed Dedicated Cloud Servers. The managed team handles backup orchestration, monitoring, and disaster recovery as part of the service, so you can focus on your application rather than infrastructure operations.

Putting It All Together

A complete backup strategy for your Ubuntu VPS combines multiple layers:

| Layer | Tool | Frequency | Retention |
| --- | --- | --- | --- |
| Infrastructure replication | Ceph 3x (automatic) | Real-time | Continuous |
| File backups | rsync + hard links | Daily | 7 daily, 4 weekly, 3 monthly |
| Database backups | mysqldump / pg_dump | Every 6 hours | 7 daily, 4 weekly, 3 monthly |
| Offsite copies | rclone to S3 | Daily | Matches local policy |
| Restore testing | Manual verification | Quarterly | N/A |

The scripts in this guide give you a solid foundation. Adapt the paths, schedules, and retention windows to match your specific application. The most important thing is to start: an imperfect backup that runs every day is infinitely better than a perfect backup plan that never gets implemented.

MassiveGRID Ubuntu VPS includes: Ubuntu 24.04 LTS pre-installed · Proxmox HA cluster with automatic failover · Ceph 3x replicated NVMe storage · Independent CPU/RAM/storage scaling · 12 Tbps DDoS protection · 4 global datacenter locations · 100% uptime SLA · 24/7 human support rated 9.5/10

Deploy a self-managed VPS — from $1.99/mo
Need dedicated resources? — from $8.30/mo
Want fully managed hosting? — we handle everything