Audience: Lab Operator
Prerequisites: Tutorial: Full Lab
Time: ~20 minutes
Overview¶
This guide covers the day-to-day operations of running a MADSci-powered self-driving laboratory. You’ll learn how to start and stop services, verify system health, and handle common operational tasks.
Starting the Lab¶
Full Lab Startup¶
# Start all services in the background
madsci start -d
# Verify everything is running
madsci status
# Or start in the foreground (logs streamed to terminal)
madsci start
Local Mode (No Docker)¶
For development or environments without Docker:
# Start all managers in-process with in-memory backends
madsci start --mode=local
# Data is ephemeral and will not persist across restarts
Full Lab Startup (Docker, Advanced)¶
# Use docker compose directly for more control
docker compose up -d
# Verify everything is running
docker compose ps
# Check health of all MADSci services
madsci status
Startup Order¶
Docker Compose handles dependency ordering, but the logical startup sequence is:
1. Infrastructure: MongoDB, PostgreSQL, Redis, MinIO
↓
2. Managers: Event, Experiment, Resource, Data, Location, Workcell
↓
3. Lab Manager: Squid (central dashboard)
↓
4. Nodes: Instrument nodes (liquidhandler, robotarm, etc.)
Starting Individual Services¶
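When services are started one at a time, the ordering that Docker Compose would otherwise enforce becomes the operator's responsibility. The following is a minimal sketch of that ordering as data, assuming the service names that appear in this guide's docker compose commands. `TIERS` and `prerequisites` are illustrative names, not part of MADSci.

```python
# Startup tiers, earliest first. Service names are taken from the
# docker compose commands in this guide; adjust them to your compose file.
TIERS = [
    ("infrastructure", ["mongodb", "postgres", "redis", "minio"]),
    ("managers", ["event_manager", "experiment_manager", "resource_manager",
                  "data_manager", "location_manager", "workcell_manager"]),
    ("lab_manager", ["lab_manager"]),
    ("nodes", ["liquidhandler_1"]),
]


def prerequisites(service, tiers=TIERS):
    """Return every service in the tiers that come before `service`'s tier."""
    earlier = []
    for _tier_name, services in tiers:
        if service in services:
            return earlier
        earlier.extend(services)
    raise KeyError(f"unknown service: {service}")
```

For example, `prerequisites("event_manager")` returns the four infrastructure services, which is the set to bring up before starting that manager on its own.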
# Start a specific manager
madsci start manager event
# Start a specific node
madsci start node ./path/to/node.py
# Start in the background with PID tracking
madsci start manager event -d
Starting Individual Services (Docker, Advanced)¶
# Start only infrastructure
docker compose up -d mongodb redis postgres minio
# Start a specific manager via Docker
docker compose up -d event_manager
# Start a specific node via Docker
docker compose up -d liquidhandler_1
# Start managers without nodes
docker compose up -d lab_manager event_manager experiment_manager \
    resource_manager data_manager location_manager workcell_manager
Starting Without Docker (Advanced)¶
For running individual services directly:
# Start a manager directly
python -m madsci.event_manager
# Start a node directly
python example_modules/liquidhandler.py
# Start with custom settings
EVENT_SERVER_PORT=8001 EVENT_MONGO_DB_URL=mongodb://localhost:27017 \
    python -m madsci.event_manager
Stopping the Lab¶
Graceful Shutdown¶
# Stop all services (preserves data volumes)
madsci stop
# Stop a specific background manager
madsci stop manager event
# Stop a specific background node
madsci stop node <name>
# Stop and remove images
madsci stop --remove
# Stop and remove volumes (data loss — requires confirmation)
madsci stop --volumes
Graceful Shutdown (Docker, Advanced)¶
# Stop all services via docker compose directly
docker compose down
# Stop a specific service
docker compose stop liquidhandler_1
# Stop and remove everything including data volumes (DESTRUCTIVE)
docker compose down -v
Emergency Shutdown¶
If services are unresponsive:
# Force stop all containers
docker compose kill
# Force stop a specific container
docker kill <container_name>
Health Checks¶
Using the CLI¶
# Quick status check of all services
madsci status
# Watch status continuously (updates every 5 seconds)
madsci status --watch
# JSON output for scripting
madsci status --json
# Check specific service health
curl http://localhost:8000/health # Lab Manager
curl http://localhost:8001/health # Event Manager
curl http://localhost:8005/health # Workcell Manager
Service Health Endpoints¶
Every MADSci manager exposes a /health endpoint:
| Service | URL | What It Checks |
|---|---|---|
| Lab Manager | http://localhost:8000/health | Manager connectivity |
| Event Manager | http://localhost:8001/health | MongoDB connection |
| Experiment Manager | http://localhost:8002/health | MongoDB connection |
| Resource Manager | http://localhost:8003/health | PostgreSQL connection |
| Data Manager | http://localhost:8004/health | MongoDB + MinIO connection |
| Workcell Manager | http://localhost:8005/health | MongoDB + Redis + node connectivity |
| Location Manager | http://localhost:8006/health | MongoDB connection |
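The endpoints in the table can also be polled from a script. This sketch assumes only that a healthy manager answers its /health URL with an HTTP 2xx status. It does not inspect the response body, whose schema is not described here. `http_ok` and `unhealthy_services` are illustrative names, not part of MADSci.

```python
import urllib.error
import urllib.request

# URLs copied from the table above; adjust if your lab overrides the ports.
HEALTH_URLS = {
    "Lab Manager": "http://localhost:8000/health",
    "Event Manager": "http://localhost:8001/health",
    "Experiment Manager": "http://localhost:8002/health",
    "Resource Manager": "http://localhost:8003/health",
    "Data Manager": "http://localhost:8004/health",
    "Workcell Manager": "http://localhost:8005/health",
    "Location Manager": "http://localhost:8006/health",
}


def http_ok(url, timeout=3.0):
    """Return True if the URL answers with an HTTP 2xx status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a 4xx/5xx response.
        return False


def unhealthy_services(urls=HEALTH_URLS, probe=http_ok):
    """Return the names of services whose health URL did not respond OK."""
    return [name for name, url in urls.items() if not probe(url)]
```

Calling `unhealthy_services()` with no arguments probes all seven managers and returns an empty list when every one responds. The `probe` parameter exists so the check can be exercised without a running lab.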
Node Health¶
# Check a specific node
curl http://localhost:2000/health
# Get node info
curl http://localhost:2000/info
# Get node state
curl http://localhost:2000/state
System Diagnostics¶
# Run comprehensive diagnostics
madsci doctor
# Check specific categories
madsci doctor --check python
madsci doctor --check docker
madsci doctor --check ports
Viewing Logs¶
Using the CLI¶
# View recent logs
madsci logs --tail 50
# Follow logs in real time
madsci logs --follow
# Filter by log level
madsci logs --level ERROR
madsci logs --level WARNING
# Filter by pattern
madsci logs --grep "workflow"
madsci logs --grep "liquidhandler"
# Logs from a specific time period
madsci logs --since 1h
madsci logs --since 30m
Using Docker¶
# All service logs
docker compose logs -f
# Specific service logs
docker compose logs -f workcell_manager
# Last 100 lines
docker compose logs --tail 100 event_manager
# Logs since a timestamp
docker compose logs --since 2026-02-09T10:00:00 workcell_manager
Using the TUI¶
madsci tui
Press l to navigate to the Logs screen. Use the filter controls to narrow down by level or search pattern.
Managing Workflows¶
Check Active Workflows¶
from madsci.client import WorkcellClient
wc = WorkcellClient(workcell_server_url="http://localhost:8005/")
# List active workflows
active = wc.get_active_workflows()
for wf_id, wf in active.items():
    print(f"{wf_id}: step {wf.status.current_step_index}, "
          f"started {wf.submitted_time}")
# Check the workflow queue
queue = wc.get_workflow_queue()
print(f"{len(queue)} workflows queued")
Cancel a Stuck Workflow¶
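Deciding which workflow counts as stuck is a judgment call. One possible heuristic, sketched here, flags active workflows whose `submitted_time` is older than a chosen threshold. It assumes `submitted_time` is a timezone-aware `datetime`, which this guide does not confirm. `stale_workflow_ids` is an illustrative helper, not part of the MADSci client.

```python
from datetime import datetime, timedelta, timezone


def stale_workflow_ids(active, max_age, now=None):
    """Return IDs of workflows submitted more than `max_age` ago.

    `active` is a mapping of workflow ID to an object with a
    `submitted_time` attribute, such as the result of
    get_active_workflows(). Assumption: that attribute is a
    timezone-aware datetime.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return [
        wf_id
        for wf_id, wf in active.items()
        if now - wf.submitted_time > max_age
    ]
```

With a client in hand, `stale_workflow_ids(wc.get_active_workflows(), timedelta(hours=2))` would list candidates to review. Any ID confirmed as stuck can then be passed to `cancel_workflow`, as in the snippet that follows.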
from madsci.client import WorkcellClient
wc = WorkcellClient(workcell_server_url="http://localhost:8005/")
wc.cancel_workflow("workflow_id_here")
Pause and Resume¶
# Pause a running workflow
wc.pause_workflow("workflow_id_here")
# Resume a paused workflow
wc.resume_workflow("workflow_id_here")
Managing Nodes¶
Check Node Status¶
from madsci.client import WorkcellClient
wc = WorkcellClient(workcell_server_url="http://localhost:8005/")
nodes = wc.get_nodes()
for name, node in nodes.items():
    print(f"{name}: {node.status}")
Restart a Node¶
# Restart via Docker
docker compose restart liquidhandler_1
# Or stop and start
docker compose stop liquidhandler_1
docker compose start liquidhandler_1
Lock/Unlock a Node¶
Locking prevents the workcell from sending actions to a node (useful during maintenance):
# Lock a node
curl -X POST http://localhost:2000/admin/lock
# Unlock a node
curl -X POST http://localhost:2000/admin/unlock
Daily Checklist¶
A recommended daily routine for lab operators:
Morning startup:
madsci start -d && madsci status
Verify health: Check all services show HEALTHY
Review overnight logs:
madsci logs --since 12h --level WARNING
Check disk space (especially for data and log volumes):
df -h
Verify backups: Check that scheduled backups completed
End of day: Review experiment results, check for errors
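The disk space item in the checklist can be automated with the standard library. This is a minimal sketch: the 10% threshold is an arbitrary example, the paths you pass in depend on where your deployment keeps its data and log volumes, and `low_disk_paths` is an illustrative name.

```python
import shutil


def low_disk_paths(paths, min_free_fraction=0.10, usage=shutil.disk_usage):
    """Return the paths whose filesystem has less free space than the threshold.

    `usage` defaults to shutil.disk_usage, which reports total, used and
    free bytes for the filesystem containing the given path.
    """
    low = []
    for path in paths:
        stats = usage(path)
        if stats.free / stats.total < min_free_fraction:
            low.append(path)
    return low
```

For example, `low_disk_paths(["/"])` returns `["/"]` only when the root filesystem is below 10% free. Pass the mount points of your own data and log volumes in place of `/`.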
What’s Next?¶
Monitoring - Detailed monitoring with TUI and observability tools
Backup & Recovery - Database backup strategies
Troubleshooting - Common issues and solutions