Backup and Recovery

Audience: Lab Operator
Prerequisites: Daily Operations
Time: ~25 minutes

Overview

MADSci stores critical data across multiple databases. Regular backups protect against data loss from hardware failures, software bugs, or accidental deletion. This guide covers backup strategies for all MADSci data stores.

Data Stores

| Database | Used By | Data Stored |
| --- | --- | --- |
| MongoDB | Event, Experiment, Data, Workcell, Location Managers | Events, experiments, datapoints, workflows, locations |
| PostgreSQL | Resource Manager | Resources, resource history, templates |
| MinIO | Data Manager | File datapoints (CSVs, images, etc.) |
| Redis | Workcell Manager | Workflow queue, node locks (ephemeral) |

Note: Redis data is ephemeral (workflow queue state). It does not require backup; the workcell recovers from MongoDB on restart.

Backup Tools

MADSci provides unified backup tools in the madsci_common package.

PostgreSQL Backups

from pathlib import Path  # needed for backup_dir below

from madsci.common.backup_tools import PostgreSQLBackupTool
from madsci.common.types.backup_types import PostgreSQLBackupSettings

settings = PostgreSQLBackupSettings(
    db_url="postgresql://postgres:postgres@localhost:5432/resources",
    backup_dir=Path("./backups/postgres"),
    max_backups=10,
    validate_integrity=True,
)

backup_tool = PostgreSQLBackupTool(settings)

# Create a backup
backup_path = backup_tool.create_backup("daily")
print(f"Backup created: {backup_path}")

# List available backups
backups = backup_tool.list_available_backups()
for b in backups:
    print(f"  {b.backup_path} ({b.backup_size} bytes, {b.created_at})")

# Validate a backup
is_valid = backup_tool.validate_backup_integrity(backup_path)
print(f"Backup valid: {is_valid}")

# Restore from backup
backup_tool.restore_from_backup(backup_path)

MongoDB Backups

from pathlib import Path  # needed for backup_dir below

from madsci.common.backup_tools import MongoDBBackupTool
from madsci.common.types.backup_types import MongoDBBackupSettings
from pydantic import AnyUrl

settings = MongoDBBackupSettings(
    mongo_db_url=AnyUrl("mongodb://localhost:27017/"),
    database="events",
    backup_dir=Path("./backups/mongodb"),
    max_backups=10,
)

backup_tool = MongoDBBackupTool(settings)

# Create a backup
backup_path = backup_tool.create_backup("daily")

# List backups
backups = backup_tool.list_available_backups()

# Restore
backup_tool.restore_from_backup(backup_path)

CLI Backup Commands

# PostgreSQL backup
madsci-postgres-backup create \
  --db-url postgresql://postgres:postgres@localhost:5432/resources \
  --backup-dir ./backups/postgres

# MongoDB backup
madsci-mongodb-backup create \
  --mongo-url mongodb://localhost:27017 \
  --database events \
  --backup-dir ./backups/mongodb

# Unified CLI (auto-detects database type)
madsci-backup create --db-url postgresql://localhost:5432/resources
madsci-backup create --db-url mongodb://localhost:27017/events

# List backups
madsci-backup list --backup-dir ./backups/postgres

# Restore
madsci-backup restore \
  --db-url postgresql://localhost:5432/resources \
  --backup-path ./backups/postgres/backup_20260209_daily.sql

Backup Strategy

| Database | Frequency | Retention | Notes |
| --- | --- | --- | --- |
| PostgreSQL (Resources) | Every 6 hours | 30 days | Critical - contains resource state |
| MongoDB (Events) | Daily | 90 days | Large volume, archivable |
| MongoDB (Experiments) | Every 6 hours | 90 days | Contains experiment results |
| MongoDB (Data) | Daily | 90 days | Metadata only; files in MinIO |
| MongoDB (Workcell) | Daily | 30 days | Workflow history |
| MongoDB (Locations) | Daily | 30 days | Location configuration |
| MinIO (Files) | Daily | 90 days | Use MinIO’s built-in replication |
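
The number of backups to keep follows from the frequency and retention columns: backups per day multiplied by days retained. A minimal sketch of that arithmetic is shown here. The helper name `backups_to_keep` is hypothetical and not part of MADSci; the result is the value you might pass as `max_backups`.

```python
# Sketch: derive how many backups to keep from the schedule and retention window.
# backups_to_keep is an illustrative helper, not a MADSci function.
def backups_to_keep(interval_hours: int, retention_days: int) -> int:
    """Number of backups produced over the retention window."""
    if interval_hours <= 0 or retention_days <= 0:
        raise ValueError("interval_hours and retention_days must be positive")
    return (retention_days * 24) // interval_hours


# (interval in hours, retention in days), taken from the strategy table
schedule = {
    "PostgreSQL (Resources)": (6, 30),
    "MongoDB (Events)": (24, 90),
    "MongoDB (Experiments)": (6, 90),
}

for name, (interval_hours, retention_days) in schedule.items():
    count = backups_to_keep(interval_hours, retention_days)
    print(f"{name}: keep {count} backups")
```

For example, a 6-hour schedule with 30 days of retention gives 4 backups per day for 30 days, or 120 backups.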

Automated Backups with Cron

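# /etc/cron.d/madsci-backups
# Entries in /etc/cron.d need a user field after the schedule, and each job
# must fit on one line (cron does not support backslash continuations).
# "root" is a placeholder; use whichever account owns the backup directories.

# PostgreSQL - every 6 hours (4 per day x 30 days = 120 backups)
0 */6 * * * root madsci-postgres-backup create --db-url postgresql://postgres:postgres@localhost:5432/resources --backup-dir /data/backups/postgres --max-backups 120

# MongoDB events - daily at 2 AM (1 per day x 90 days = 90 backups)
0 2 * * * root madsci-mongodb-backup create --mongo-url mongodb://localhost:27017 --database events --backup-dir /data/backups/mongodb/events --max-backups 90

# MongoDB experiments - every 6 hours (4 per day x 90 days = 360 backups)
0 */6 * * * root madsci-mongodb-backup create --mongo-url mongodb://localhost:27017 --database experiments --backup-dir /data/backups/mongodb/experiments --max-backups 360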

Backup Validation

Validate backups on a regular schedule, not only when you need to restore:

from madsci.common.backup_tools import BackupValidator

validator = BackupValidator()

# Validate a specific backup
# (backup_path is the path returned by create_backup() in the examples above)
result = validator.validate(backup_path)
print(f"Valid: {result.is_valid}")
print(f"Checksum: {result.checksum}")
print(f"Size: {result.backup_size} bytes")
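
As an independent cross-check, you can also hash a backup file yourself and compare the digest with one recorded at backup time. The sketch below uses only the Python standard library. It assumes a SHA-256 digest, which the guide does not confirm is what MADSci uses, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_checksum(path: Path, recorded: str) -> bool:
    """Compare a file's digest with a previously recorded hex digest."""
    return sha256_of_file(path) == recorded.strip().lower()
```

Recording the digest right after the backup is created and re-checking it before a restore can catch silent corruption on disk or during transfer.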

Recovery Procedures

Scenario 1: Single Database Corruption

# 1. Stop the affected manager
docker compose stop resource_manager

# 2. Restore from latest backup
madsci-postgres-backup restore \
  --db-url postgresql://postgres:postgres@localhost:5432/resources \
  --backup-path /data/backups/postgres/latest.sql

# 3. Restart the manager
docker compose start resource_manager

# 4. Verify health
curl http://localhost:8003/health
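
The restore step above points at a file named latest.sql. If your backup directory holds timestamped files instead, a small helper can pick the most recent one to pass as the backup path. This is a sketch using only the standard library; `newest_backup` is a hypothetical name, and choosing by modification time is an assumption about how you want "latest" defined.

```python
from pathlib import Path
from typing import Optional


def newest_backup(backup_dir: Path, pattern: str = "*.sql") -> Optional[Path]:
    """Return the most recently modified file matching pattern, or None if there is none."""
    candidates = [p for p in backup_dir.glob(pattern) if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Check for a `None` result before restoring, so that an empty backup directory stops the procedure instead of passing an invalid path to the restore command.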

Scenario 2: Full System Recovery

# 1. Start infrastructure only
docker compose up -d mongodb postgres redis minio

# 2. Wait for databases to be ready
sleep 10

# 3. Restore PostgreSQL
madsci-postgres-backup restore \
  --db-url postgresql://postgres:postgres@localhost:5432/resources \
  --backup-path /data/backups/postgres/latest.sql

# 4. Restore MongoDB databases
for db in events experiments data workcell locations; do
  madsci-mongodb-backup restore \
    --mongo-url mongodb://localhost:27017 \
    --database "$db" \
    --backup-path "/data/backups/mongodb/$db/latest.bson"
done

# 5. Start managers
docker compose up -d

# 6. Verify all services
madsci status
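
Step 2 waits a fixed 10 seconds, which may be too short on a slow host. A sketch of an alternative is to poll each service's TCP port until it accepts connections. Ports 27017, 5432, and 9000 appear elsewhere in this guide; 6379 is the Redis default and is an assumption here. The function names are hypothetical, and an open port only shows that the process is listening, not that the database has finished its own startup.

```python
import socket
import time
from typing import Dict, List

# Service name -> port. 6379 is assumed (Redis default); adjust to your compose file.
SERVICES: Dict[str, int] = {"mongodb": 27017, "postgres": 5432, "redis": 6379, "minio": 9000}


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll until host:port accepts a TCP connection; return False once timeout has elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def unreachable_services(services: Dict[str, int], host: str = "localhost",
                         timeout: float = 60.0) -> List[str]:
    """Return the names of services whose ports never opened within timeout."""
    return [name for name, port in services.items()
            if not wait_for_port(host, port, timeout=timeout)]
```

Calling `unreachable_services(SERVICES)` before the restore steps and stopping if the returned list is non-empty avoids restoring into a database that is not up yet.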

Scenario 3: Pre-Migration Backup

Back up the database before running any migration. The Resource Manager migration tool does this for you:

# The migration tool creates automatic backups
python -m madsci.resource_manager.migration_tool \
  --db-url postgresql://postgres:postgres@localhost:5432/resources

# If migration fails, it auto-restores from the backup

MinIO Backup

MinIO stores file datapoints (CSVs, images, raw instrument data). Use MinIO’s built-in tools:

# Register the MinIO server under an alias (requires the MinIO client, mc, to be installed)
mc alias set myminio http://localhost:9000 minioadmin minioadmin

# Mirror to backup location
mc mirror myminio/madsci-data /data/backups/minio/

# Or use mc cp for specific buckets
mc cp --recursive myminio/madsci-data/ /data/backups/minio/madsci-data/

Docker Volume Backups

For simple setups, you can back up Docker volumes directly:

# List volumes
docker volume ls | grep madsci

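# Backup a volume
# Stop the database first so the files on disk are consistent,
# e.g. docker compose stop mongodb
docker run --rm \
  -v madsci_mongodb_data:/data \
  -v "$(pwd)/backups:/backup" \
  alpine tar czf /backup/mongodb_data.tar.gz /data

# Restore a volume (tar stores paths relative to /, so extract with -C /)
docker run --rm \
  -v madsci_mongodb_data:/data \
  -v "$(pwd)/backups:/backup" \
  alpine tar xzf /backup/mongodb_data.tar.gz -C /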
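
Before relying on a volume archive, it is worth confirming that it can be read end to end. The sketch below walks a .tar.gz file with Python's standard `tarfile` module and reports how many entries it holds. `archive_summary` is a hypothetical helper; a truncated or corrupt archive is expected to raise an exception instead of returning.

```python
import tarfile
from pathlib import Path
from typing import Tuple


def archive_summary(archive: Path) -> Tuple[int, int]:
    """Return (entry_count, total_file_bytes) for a gzip-compressed tar archive.

    Iterating over every member forces the whole archive to be decompressed,
    so an unreadable archive raises an exception here.
    """
    entries = 0
    total_bytes = 0
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar:
            entries += 1
            if member.isfile():
                total_bytes += member.size
    return entries, total_bytes
```

An archive that reports zero entries, or far fewer bytes than the live volume holds, is a sign the backup command ran against the wrong path.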

Backup Checklist

What’s Next?