Audience: Equipment Integrator Prerequisites: Wiring the Node, Testing Strategies Time: ~20 minutes
Overview¶
Debugging instrument modules involves isolating issues across multiple layers: hardware communication, interface logic, node framework, and workcell orchestration. This guide covers systematic approaches to identifying and resolving problems at each layer.
Debugging Layers¶
Layer 4: Workcell / Workflow
Layer 3: Node REST API
Layer 2: Interface Logic
Layer 1: Hardware / DriverRule of thumb: Always start debugging at the lowest layer and work upward. If the interface doesn’t work standalone, the node won’t work either.
Layer 1: Hardware and Driver Issues¶
Symptoms¶
Connection timeouts
No response from device
Garbled data
Permission errors
Diagnosis¶
Check physical connection:
# Linux: List serial devices
ls -la /dev/ttyUSB* /dev/ttyACM*
# macOS: List serial devices
ls -la /dev/tty.usb* /dev/cu.usb*
# Check USB devices
lsusb # Linux
system_profiler SPUSBDataType # macOSCheck permissions:
# Linux: Add user to dialout group for serial access
sudo usermod -aG dialout $USER
# Log out and back in for group change to take effect
# Check device permissions
ls -la /dev/ttyUSB0
# Should show: crw-rw---- 1 root dialout ...Test raw communication:
# Quick serial test (independent of MADSci)
import serial
port = serial.Serial("/dev/ttyUSB0", baudrate=9600, timeout=5)
port.write(b"*IDN?\n")
response = port.readline()
print(f"Response: {response}")
port.close()Test network communication:
# Check if device is reachable
ping 192.168.1.100
# Check if port is open
nc -zv 192.168.1.100 5000
# Test HTTP endpoint
curl http://192.168.1.100:5000/api/statusCommon Fixes¶
| Problem | Solution |
|---|---|
Permission denied: /dev/ttyUSB0 | Add user to dialout group (Linux) or check System Preferences > Security (macOS) |
| Device not found | Check cable, try different USB port, verify driver installed |
| Garbled data | Check baud rate, parity, stop bits match device settings |
| Timeout on read | Increase timeout, check device is powered on, verify command format |
Address already in use | Another process has the port open. Use lsof /dev/ttyUSB0 to find it |
Layer 2: Interface Issues¶
Symptoms¶
Interface methods raise unexpected exceptions
Incorrect data returned
State inconsistencies
Diagnosis¶
Test the interface directly in Python:
# Interactive debugging (works in Jupyter too)
from my_sensor_interface import MySensorInterface
iface = MySensorInterface(port="/dev/ttyUSB0", baud_rate=9600)
iface.connect()
# Test individual operations
print(f"Connected: {iface.is_connected()}")
print(f"Reading: {iface.read_sensor()}")
print(f"State: {iface.get_state()}")
iface.disconnect()Compare real vs. fake interface behavior:
from my_sensor_interface import MySensorInterface
from my_sensor_fake_interface import MySensorFakeInterface
# Run the same operations on both
for iface in [MySensorFakeInterface(), MySensorInterface(port="/dev/ttyUSB0")]:
iface.connect()
try:
reading = iface.read_sensor()
print(f"{iface.__class__.__name__}: {reading}")
except Exception as e:
print(f"{iface.__class__.__name__}: ERROR - {e}")
finally:
iface.disconnect()Add debug logging to your interface:
import logging
logger = logging.getLogger(__name__)
class MySensorInterface:
def read_sensor(self, samples=1):
logger.debug(f"Reading sensor with {samples} samples")
raw = self._send_command(f"READ {samples}")
logger.debug(f"Raw response: {raw!r}")
parsed = self._parse_response(raw)
logger.debug(f"Parsed response: {parsed}")
return parsedRun with debug logging:
PYTHONPATH=src python -c "
import logging
logging.basicConfig(level=logging.DEBUG)
from my_sensor_interface import MySensorInterface
iface = MySensorInterface(port='/dev/ttyUSB0')
iface.connect()
print(iface.read_sensor())
iface.disconnect()
"Layer 3: Node Issues¶
Symptoms¶
Node starts but actions fail
Incorrect action parameters
Status endpoint returns errors
Actions hang or timeout
Diagnosis¶
Check node health:
# Health check
curl http://localhost:2000/health
# Node info (shows registered actions)
curl http://localhost:2000/info | python -m json.tool
# Current state
curl http://localhost:2000/state | python -m json.tool
# Node log
curl http://localhost:2000/logTest actions directly:
# List available actions (from /info endpoint)
curl http://localhost:2000/info | python -c "
import sys, json
info = json.load(sys.stdin)
for action in info.get('actions', info.get('capabilities', {}).get('actions', [])):
print(f' {action}')
"
# Execute an action
curl -X POST http://localhost:2000/actions/measure \
-H "Content-Type: application/json" \
-d '{"samples": 1}' | python -m json.toolRun node with verbose logging:
# Set log level via environment
LOG_LEVEL=DEBUG python my_sensor_rest_node.pyCheck for port conflicts:
# Check if port 2000 is already in use
lsof -i :2000 # macOS/Linux
netstat -tlnp | grep 2000 # LinuxCommon Node Issues¶
| Problem | Cause | Solution |
|---|---|---|
Address already in use | Another node or process on same port | Change port in config or stop the other process |
| Action not found | Method missing @action decorator | Add @action decorator to the method |
| Parameter validation error | Type mismatch in action parameters | Check type hints match the JSON being sent |
| Action hangs | Interface method blocks indefinitely | Add timeouts to interface operations |
startup_handler fails | Interface can’t connect | Check hardware connection, or switch to fake interface |
AttributeError: 'NoneType' | Interface not initialized | Ensure startup_handler runs before actions |
Layer 4: Workcell and Workflow Issues¶
Symptoms¶
Workflow steps fail
Node not found by workcell
Location resolution errors
Workflow hangs on a step
Diagnosis¶
Check workcell status:
madsci statusCheck node registration:
from madsci.client import WorkcellClient
wc = WorkcellClient(workcell_server_url="http://localhost:8005/")
nodes = wc.get_nodes()
for name, node in nodes.items():
print(f" {name}: {node.status} at {node.url}")Check workflow execution:
from madsci.client import WorkcellClient
wc = WorkcellClient(workcell_server_url="http://localhost:8005/")
# Get active workflows
active = wc.get_active_workflows()
for wf_id, wf in active.items():
print(f"Workflow {wf_id}: step {wf.status.current_step_index}")
for step in wf.steps:
print(f" {step.name}: {step.status}")
# Get archived (completed) workflows
archived = wc.get_archived_workflows(number=5)
for wf_id, wf in archived.items():
print(f"Workflow {wf_id}: {wf.status}")Check event logs:
# View recent events
madsci logs --tail 50
# Filter by level
madsci logs --level ERROR --tail 20
# Filter by pattern
madsci logs --grep "my_sensor"
# Follow logs in real time
madsci logs --followCommon Workcell Issues¶
| Problem | Cause | Solution |
|---|---|---|
| Node not found | Node not registered with workcell | Check workcell config includes node URL |
| Step fails with connection error | Node URL incorrect or node not running | Verify node is running and URL is reachable from workcell |
| Location not resolved | Location not configured in Location Manager | Add location with LocationClient.add_location() |
| Workflow hangs | Step waiting for locked node | Check node status, unlock if stuck |
| Wrong parameters passed | Workflow YAML parameter names don’t match action | Compare workflow step args with action method signature |
Using the MADSci TUI for Debugging¶
The TUI provides a live view of system status:
madsci tuiKey screens:
Dashboard (
d): Overview of all service healthStatus (
s): Detailed service status tableLogs (
l): Live log viewer with filtering
Debugging Checklist¶
When something isn’t working, follow this systematic checklist:
Can you talk to the hardware directly? (Python REPL, no MADSci)
Does the interface work standalone? (Import and call methods)
Does the fake interface pass all tests? (
pytest tests/test_interface.py)Does the node start without errors? (
python my_sensor_rest_node.py)Can you call actions via curl? (
curl -X POST http://localhost:2000/actions/measure)Is the node registered with the workcell? (
madsci status)Do workflow steps reference the correct action names and parameters?
Are the event logs showing errors? (
madsci logs --level ERROR)
What’s Next?¶
Packaging & Deployment - Docker, dependencies, CI/CD
Publishing - Sharing modules with others