Integrating your Devices with MADSci Nodes

This notebook aims to teach you how to automate and integrate all the devices, instruments, sensors, and robots in your self-driving lab using the MADSci Node standard.

Goals¶

After completing this notebook, you should understand:

What we mean when we talk about a MADSci Node
The MADSci Node Standard
How to integrate and automate a device using the MADSci Node standard
How to use the RestNode Python class included in madsci.node_module to integrate a MADSci Node
How to leverage the built-in MADSci client management through the MadsciClientMixin

What is a Node?¶

In MADSci, a Node refers to a single instrument, sensor, robot, or other device, combined with the software needed to control, operate, automate, and integrate it into the Automated or Autonomous Lab as a whole.
Nodes accept Action Requests and return Action Results. They also report State and Status, provide Info about themselves, log Events, and can optionally use and manage Resources.

Diagram illustrating how a node intermediates between a User and different devices

Anatomy of a Node¶

A Node typically consists of the following sub-components:

A physical device (robot, instrument, sensor, etc.)
A driver, API, library, or software application for communicating with that device, often provided by the hardware vendor
A device interface class that handles the necessary initialization, communication, and cleanup required to use the given device
The node implementation, typically a class, which uses the interface to handle the execution of the actions and the lifecycle of the node (state, status, startup, shutdown, etc.)
The node definition and configuration, which define specific details about a given instance of a node and how it should be configured. We typically implement these as .yaml files
The node server and node client, which allow for standardized control of nodes. These are implementations of the MADSci Node standard interface, and operate using standard protocols (currently, REST-based HTTP)

Diagram of the components of a MADSci Node

Node Instances vs. Node Modules¶

Sometimes, we need to disambiguate between “this particular Node, tied to a specific device” and “this class of Node, tied to this type of device”. We usually refer to the former as a Node Instance, and the latter as a Node Module when we need to be precise.

For example, we might have one “Opentrons OT-2” Node Module that we’ve written, which we then use to run two different Node Instances in the lab: “OT-2 Alpha” and “OT-2 Beta”.

# Install dependencies
%pip install madsci.node_module madsci.client httpx

Defining A Node¶

While you can build a MADSci-compliant Node entirely from scratch, that’s a lot of work. To save you some time and effort, we’ve implemented our own RestNode Python class that you can inherit from as the basis for defining a device’s Node.

In the rest of the notebook, we’ll demonstrate an example of integrating a fake “robot arm” as a MADSci Node Module using our RestNode class.

Node Definition Files¶

In addition to the code that makes up a Node Module’s implementation, an individual instance of a Node has a NodeDefinition, which includes all the configuration and information for a single, specific instance of a device. We generally store this definition as a .yaml file that is then passed as an argument when we start the Node server.

These NodeDefinitions allow you to easily differentiate between and configure multiple Node Instances of a given Node Module.
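
To make the Node Module versus Node Instance distinction concrete, here is a minimal, library-free sketch. `SketchNodeDefinition` is a hypothetical stand-in that only mirrors a few of the `NodeDefinition` fields used in this notebook (`node_name`, `module_name`, `node_description`); it is not the MADSci class, and the names `ot2_alpha`, `ot2_beta`, and `opentrons_ot2` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchNodeDefinition:
    """Hypothetical stand-in for NodeDefinition (not the MADSci class)."""

    node_name: str  # unique per Node Instance
    module_name: str  # shared by every instance of the same Node Module
    node_description: str = ""


# One Node Module ("opentrons_ot2"), two Node Instances in the lab.
ot2_alpha = SketchNodeDefinition(node_name="ot2_alpha", module_name="opentrons_ot2")
ot2_beta = SketchNodeDefinition(node_name="ot2_beta", module_name="opentrons_ot2")

same_module = ot2_alpha.module_name == ot2_beta.module_name
distinct_instances = ot2_alpha.node_name != ot2_beta.node_name
print(same_module, distinct_instances)
```

In practice each definition would live in its own .yaml file, and the same node code would be started once per file.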

from madsci.common.types.node_types import NodeDefinition, NodeType
from madsci.node_module.rest_node_module import RestNode
from rich import print

class ExampleRobotNode(RestNode):
    """Define an Example Robot Node. It doesn't do anything yet, but it's a good starting point."""

    def startup_handler(self) -> None:
        """Demonstrate access to MADSci clients during startup."""
        # Log startup using the built-in event_client
        self.event_client.info("Example robot node starting up")

        # You can also access other clients as needed:
        # self.resource_client - for resource management
        # self.data_client - for data operations
        # Clients are automatically configured based on the current MADSci context


# if __name__ == "__main__":
#     example_node = ExampleRobotNode() #noqa
#     example_node.start_node() #noqa
#
# Then, run `python example_robot_node.py --definition <path/to/example_robot.node.yaml>` to start the node

# In most cases, this node definition is a .yaml file that we pass as the "--definition" argument at runtime,
# rather than defining it in code.
node_definition = NodeDefinition(
    node_name="example_robot_1",
    module_name="example_robot_module",
    node_type=NodeType.DEVICE,
    node_description="An example node for controlling our fake robot arm.",
)
example_node = ExampleRobotNode(node_definition=node_definition)
# Normally, `start_node` starts an HTTP server and begins listening for incoming requests.
# However, in this case, we are using the `testing` argument to "start the node" without actually starting the server.
example_node.start_node(testing=True)

print(node_definition)

# Demo Magic to avoid having to actually run rest servers
# This is a mock server that simulates the behavior of the actual server.
# It allows us to test the client without needing to run the server.
# This is useful for unit testing and debugging, or running a demo
import contextlib
from collections.abc import Generator
from typing import Any
from unittest.mock import patch

from fastapi.testclient import TestClient
from madsci.client.node.rest_node_client import RestNodeClient


@contextlib.contextmanager
def node_server(node: RestNode) -> Generator[TestClient, None, None]:
    """Mock server context manager."""

    test_client = TestClient(node.rest_api)

    with test_client as requests:
        # Mock the server's behavior
        yield requests


@contextlib.contextmanager
def node_client(
    node: RestNode, client: RestNodeClient
) -> Generator[RestNodeClient, None, None]:
    """Mock client context manager."""

    with node_server(node) as requests:
        # Patch the session methods directly instead of the requests module
        original_post = client.session.post
        original_get = client.session.get

        def post_no_timeout(*args: Any, **kwargs: Any) -> Any:
            kwargs.pop("timeout", None)
            return requests.post(*args, **kwargs)

        def get_no_timeout(*args: Any, **kwargs: Any) -> Any:
            kwargs.pop("timeout", None)
            return requests.get(*args, **kwargs)

        client.session.post = post_no_timeout
        client.session.get = get_no_timeout

        try:
            yield client
        finally:
            # Restore original methods
            client.session.post = original_post
            client.session.get = original_get

Node Configuration and Device Interfaces¶

In addition to the Node Class and the NodeDefinition, there are two other important things we may want to define in order to integrate a device.

Node Config¶

The NodeConfig class is a class we define for a Node that, as the name implies, defines the configuration available for a specific Node. This often includes things like device identifiers or IP addresses, initialization arguments to pass to the device’s driver, or details about the specific device being controlled. It also includes common configurations for the node itself, such as how to configure the REST server for the node.

We define our NodeConfig as a Pydantic Dataclass, and have a RestNodeConfig we can use as a base class with default configuration for the REST server.

One of the advantages of defining our NodeConfig in this way is that we can set configuration values:

As part of the NodeDefinition, using the config_overrides field
Using command-line parameters, when we start the Node
Using the set_config endpoint on a running Node.
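
The layering described above can be sketched without MADSci at all. In the sketch below, `SketchRobotConfig` and `apply_overrides` are hypothetical names, the default values are invented, and the precedence (later layers win, in the order definition, command line, runtime) is an assumption made for illustration rather than documented MADSci behavior.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchRobotConfig:
    """Hypothetical config with made-up defaults; not a MADSci class."""

    robot_number: int = 0
    status_update_interval: float = 2.0


def apply_overrides(config: SketchRobotConfig, *layers: dict) -> SketchRobotConfig:
    """Apply override dicts in order; later layers win (assumed precedence)."""
    for layer in layers:
        # replace() raises TypeError on unknown field names, so typos surface early.
        config = replace(config, **layer)
    return config


definition_overrides = {"robot_number": 1}  # e.g. from config_overrides
cli_overrides = {"status_update_interval": 0.5}  # e.g. from command-line parameters
runtime_overrides = {"robot_number": 2}  # e.g. sent to a set_config endpoint

effective = apply_overrides(
    SketchRobotConfig(), definition_overrides, cli_overrides, runtime_overrides
)
print(effective)
```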

Device Interface¶

Many of the devices we need to automate have their own DLLs, APIs, libraries, or other means of controlling and communicating with them. The exact details, capabilities, and implementations vary wildly from device to device, but in most cases these are complex enough to justify their own wrapper class.

Implementing the device specific logic as a standalone class provides two key advantages:

We can use the device interface standalone, for testing and debugging, maintenance, or in applications where we can’t or don’t want to use a full Node.
We separate the concerns of managing the node from the nuts and bolts of interfacing with the device.

This isn’t always necessary (some devices have APIs so simple that it isn’t worth the effort), but is often worthwhile for complex devices or unwieldy APIs.
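
As a rough illustration of the first advantage, the sketch below uses a device interface on its own, with no node or server involved. `SketchGripperInterface` and its `port` argument are hypothetical, and the context-manager pattern is just one reasonable way to package initialization and cleanup, not something MADSci requires.

```python
class SketchGripperInterface:
    """Hypothetical device interface standing in for a vendor driver wrapper."""

    def __init__(self, port: str) -> None:
        self.port = port
        self.connected = False
        self.closed = False

    def __enter__(self) -> "SketchGripperInterface":
        # Initialization: open the connection to the (fake) device.
        self.connected = True
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Cleanup: always release the connection, even after an error.
        self.connected = False

    def close_gripper(self) -> None:
        if not self.connected:
            raise RuntimeError("Device is not connected.")
        self.closed = True


# Standalone use, e.g. from a test or a maintenance script.
with SketchGripperInterface(port="/dev/ttyUSB0") as gripper:
    gripper.close_gripper()
    was_connected = gripper.connected

print(was_connected, gripper.closed, gripper.connected)
```

A node built on top of such a class would only need to create it in its startup logic and release it in its shutdown logic.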

from madsci.common.types.node_types import RestNodeConfig


class ExampleRobotConfig(RestNodeConfig):
    """Example Configuration options for our ExampleRobotNode."""

    robot_number: int = 0
    """An identifier for the robot we are controlling."""

import random
import time


class ExampleRobotInterface:
    """Example Robot Interface. This is a simple interface for controlling a (fake) robot arm."""

    robot_number: int = 0
    """An identifier for the robot we are controlling."""
    joint_angles: list[float] = [0.0, 0.0, 0.0, 0.0]  # noqa
    """The joint angles of the robot."""
    gripper_closed: bool = False
    """The state of the gripper, open or closed."""
    is_moving: bool = False
    """Whether the robot is currently moving."""

    def __init__(self, robot_number: int = 0) -> None:
        """Initialize the ExampleRobotInterface with a robot number."""
        self.robot_number = robot_number

    def get_robot_number(self) -> int:
        """Get the robot number."""
        return self.robot_number

    def move_to_joint_angles(self, angles: list[float]) -> None:
        """Move the robot to the specified joint angles."""
        if self.is_moving:
            raise RuntimeError("Robot is already moving.")
        if len(angles) != 4:
            raise ValueError("Expected 4 joint angles.")
        self.is_moving = True
        time.sleep(5)
        self.is_moving = False
        self.joint_angles = angles

    def close_gripper(self) -> None:
        """Close the gripper."""
        if self.is_moving:
            raise RuntimeError("Robot is already moving.")
        self.is_moving = True
        time.sleep(1)
        self.is_moving = False
        self.gripper_closed = True

    def open_gripper(self) -> None:
        """Open the gripper."""
        if self.is_moving:
            raise RuntimeError("Robot is already moving.")
        self.is_moving = True
        time.sleep(1)
        self.is_moving = False
        self.gripper_closed = False

    def get_force_sensor_value(self) -> int:
        """Get the current value of the force sensor."""
        # For this example, we'll just return a random value.
        return random.randint(0, 100)

Node Info¶

A node’s way of introducing itself

A useful part of the MADSci Node Standard is the NodeInfo: a description of the node, its current configuration and capabilities, and just about everything else you might reasonably want to know about it. The RestNode Python class does its best to automatically generate as much of the Node Info as possible, so that users, agents, and other components of the system can easily understand the capabilities and functionality of a given node.

with node_client(example_node, RestNodeClient(url="http://localhost:2000")) as client:
    # Example of a request to the node
    print(client.get_info())
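
To give a feel for how this kind of information can be generated automatically, here is a small sketch that derives an action description from a function's signature and docstring using the standard library's `inspect` module. This is not how RestNode is implemented; `describe_action` and the shape of the returned dictionary are assumptions made purely for illustration.

```python
import inspect


def describe_action(func) -> dict:
    """Build a small info record from a function's signature and docstring."""
    signature = inspect.signature(func)
    args = {}
    for name, param in signature.parameters.items():
        if name == "self":
            continue  # not an argument the caller supplies
        args[name] = {
            "type": getattr(param.annotation, "__name__", str(param.annotation)),
            "required": param.default is inspect.Parameter.empty,
        }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "args": args,
    }


def move_joints(self, joint_angles: list, speed: float = 1.0) -> None:
    """Move the robot to a set of joint angles."""


info = describe_action(move_joints)
print(info)
```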

Node Lifecycle¶

Generally speaking, nodes have the following common lifecycle components:

start_node: this default function kicks off the REST server, reads in the node definition and config, and calls the user-defined startup_handler
startup_handler: a user-defined function that handles whatever logic needs to be done to initialize the underlying device and prepare the node for operation. The node won’t start accepting actions until the startup_handler completes.
shutdown_handler: a user-defined function that handles the logic of shutting the node down: disconnecting from the device, cleaning up system resources, etc.
state_handler: allows the node to update its published state, a JSON-serializable dictionary
status_handler: allows the node to implement any custom logic around updating the NodeStatus used by the node and the rest of the system to understand the condition of the node.

The startup_handler and shutdown_handler generally run once, while the state_handler and status_handler run periodically, with a configurable interval.

Green represents developer-defined functionality, blue is built-in to the REST Node class
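
The ordering of these lifecycle calls can be sketched in plain Python. `SketchNode` below is hypothetical and is not RestNode: it replaces the real timers with a simple tick counter, and the `state_every` parameter is an invented stand-in for the configurable update intervals.

```python
class SketchNode:
    """Hypothetical node that records the order of its lifecycle calls."""

    def __init__(self) -> None:
        self.calls = []

    def startup_handler(self) -> None:
        self.calls.append("startup")

    def state_handler(self) -> None:
        self.calls.append("state")

    def status_handler(self) -> None:
        self.calls.append("status")

    def shutdown_handler(self) -> None:
        self.calls.append("shutdown")

    def run(self, ticks: int, state_every: int = 2) -> None:
        """Run startup once, the periodic handlers for `ticks` iterations, then shutdown."""
        self.startup_handler()
        try:
            for tick in range(ticks):
                self.status_handler()  # runs on every tick
                if tick % state_every == 0:
                    self.state_handler()  # runs every `state_every` ticks
        finally:
            # Shutdown runs even if a periodic handler raises.
            self.shutdown_handler()


node = SketchNode()
node.run(ticks=3)
print(node.calls)
```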

from typing import Optional


class RobotNodeWithLifecycle(ExampleRobotNode):
    """Define an Example Node with startup and shutdown handlers."""

    config: ExampleRobotConfig = ExampleRobotConfig()
    """The configuration model for the node."""
    robot_interface: Optional[ExampleRobotInterface] = None
    """The robot interface for controlling the robot."""

    def startup_handler(self) -> None:
        """Handle the startup of the node."""
        # Use the built-in event_client for logging instead of self.logger
        self.event_client.info(f"Connecting to robot {self.config.robot_number}...")
        self.robot_interface = ExampleRobotInterface(self.config.robot_number)
        self.event_client.info(
            f"Connected to robot {self.robot_interface.get_robot_number()}"
        )

    def shutdown_handler(self) -> None:
        """Handle the shutdown of the node."""
        self.event_client.info(
            f"Disconnecting from robot {self.config.robot_number}..."
        )
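        # Reset to None rather than deleting the attribute, so the handlers'
        # `is not None` checks keep working and a repeated shutdown cannot fail.
        self.robot_interface = None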
        self.event_client.info(f"Disconnected from robot {self.config.robot_number}")

example_node = RobotNodeWithLifecycle(node_definition=node_definition)
example_node.start_node(testing=True)

with node_client(example_node, RestNodeClient(url="http://localhost:2000")) as client:
    while not client.get_status().ready:
        # Wait for the node to be ready
        time.sleep(0.1)
    print(client.get_status())

import time


class RobotNodeWithUpdates(RobotNodeWithLifecycle):
    """Define an Example Node that periodically updates its status and public-facing state."""

    node_state = {  # noqa
        "joint_angles": [0.0, 0.0, 0.0, 0.0],
    }

    def state_handler(self) -> None:
        """This is where you can implement logic to periodically update the node's public-facing state information."""
        if self.robot_interface is not None:
            self.node_state = {"joint_angles": self.robot_interface.joint_angles}
        else:
            self.node_state = {"joint_angles": None}

    def status_handler(self) -> None:
        """
        This is where you can implement logic to periodically update the node's status information.
        You can also use the MADSci clients here for logging or data operations.
        """
        if self.robot_interface is not None and self.robot_interface.is_moving:
            self.node_status.busy = True
            # Log status changes using the event_client
            self.event_client.debug("Robot is moving, node status set to busy")
        else:
            self.node_status.busy = len(self.node_status.running_actions) > 0


example_node = RobotNodeWithUpdates(
    node_definition=node_definition,
    node_config=ExampleRobotConfig(
        state_update_interval=2,  # Change how frequently, in seconds, the node state is updated
        status_update_interval=0.5,  # Change how frequently, in seconds, the node status is updated
    ),
)
example_node.node_status.errors = []
example_node.start_node(testing=True)

with node_client(example_node, RestNodeClient(url="http://localhost:2000")) as client:
    while not client.get_status().ready:
        # Wait for the node to be ready
        time.sleep(0.1)
    print(client.get_state())
    print(client.get_status())
    example_node.robot_interface.move_to_joint_angles([0.5, 0.5, 0.5, 0.5])
    # Wait for the next periodic update so the new state & status are published
    time.sleep(2.0)
    print(client.get_state())
    print(client.get_status())

Node Actions¶

Now that we have the lifecycle of the Node working, let’s talk about Actions. In MADSci, a Node Action is, essentially, a function that you can call on the node in a standardized way. These functions typically, but not necessarily, involve the device the node controls taking some action (e.g. a camera taking a picture, a robot arm moving, a plate reader doing a reading, etc.).

Anatomy of an Action¶

Node Actions can be defined with the following:

Name: a unique name for the action to perform
Description: a human-readable description of the action
Arguments: the arguments the action takes. Normal arguments must be JSON-serializable types, and can be either required or optional. There are two special cases of arguments:
- File Arguments: files the action accepts as inputs, which are uploaded by the client
- Location Arguments: these are functionally the same as regular Arguments, but have special meaning when using a node as part of a MADSci workcell (more on that later)
Results: these are what an action returns after completing, typically in the form of JSON Data, Files, or Datapoint IDs

To perform an action with a node, you send an ActionRequest, and the node returns ActionResults capturing the progress and final outcome of the action.
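
The request/result flow can be sketched as a tiny dispatcher. Everything here is hypothetical and library-free: `sketch_action`, `handle_request`, the dictionary-based request and result shapes, and the `"succeeded"`/`"failed"` status strings are illustrative assumptions, not the MADSci `action` decorator or its `ActionRequest`/`ActionResult` types.

```python
from typing import Any, Callable

ACTIONS: dict[str, Callable[..., Any]] = {}


def sketch_action(func):
    """Register a function as a callable action under its own name."""
    ACTIONS[func.__name__] = func
    return func


@sketch_action
def move_joints(joint_angles: list) -> None:
    if len(joint_angles) != 4:
        raise ValueError("Expected 4 joint angles.")


def handle_request(request: dict) -> dict:
    """Dispatch a request dict to an action and wrap the outcome in a result dict."""
    name = request["action_name"]
    func = ACTIONS.get(name)
    if func is None:
        return {"status": "failed", "errors": [f"Unknown action: {name}"]}
    try:
        data = func(**request.get("args", {}))
    except (TypeError, ValueError) as error:
        # TypeError covers missing or unexpected arguments.
        return {"status": "failed", "errors": [str(error)]}
    return {"status": "succeeded", "data": data}


ok = handle_request({"action_name": "move_joints", "args": {"joint_angles": [0.5] * 4}})
missing = handle_request({"action_name": "move_joints", "args": {}})
bad = handle_request({"action_name": "move_joints", "args": {"joint_angles": [0.5] * 3}})
print(ok["status"], missing["status"], bad["status"])
```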

Client-Server Action Methods¶

Lifecycle of an Action¶

from madsci.common.types.action_types import (
    ActionFailed,
    ActionRequest,
)
from madsci.common.types.node_types import NodeStatus
from madsci.node_module.helpers import action


class RobotNodeWithAction(RobotNodeWithUpdates):
    """Define an example robot node with an action."""

    node_status = NodeStatus()

    @action
    def move_joints(self, joint_angles: list[float]) -> None:
        """
        An example action: moving the robot to a set of joint angles.
        """
        if self.robot_interface is None:
            self.event_client.error("Robot interface not initialized")
            return ActionFailed(errors="Robot interface not initialized")
        if self.robot_interface.is_moving:
            self.event_client.error("Robot is already moving")
            return ActionFailed(errors="Robot is already moving")
        if len(joint_angles) != 4:
            self.event_client.error("Invalid number of joint angles. Expected 4.")
            return ActionFailed(errors="Invalid number of joint angles. Expected 4.")
        self.robot_interface.move_to_joint_angles(joint_angles)
        self.event_client.info(f"Moved robot to joint angles: {joint_angles}")
        return None

    # This action returns an integer value
    @action
    def use_force_sensor(self, joint_angles: list[float]) -> int:
        """
        An example action that returns a value: moving the robot to a set of joint angles and using the force sensor.
        """
        if self.robot_interface is None:
            self.event_client.error("Robot interface not initialized")
            return ActionFailed(errors="Robot interface not initialized")
        if self.robot_interface.is_moving:
            self.event_client.error("Robot is already moving")
            return ActionFailed(errors="Robot is already moving")
        if len(joint_angles) != 4:
            self.event_client.error("Invalid number of joint angles. Expected 4.")
            return ActionFailed(errors="Invalid number of joint angles. Expected 4.")
        self.robot_interface.move_to_joint_angles(joint_angles)
        self.event_client.info(f"Moved robot to joint angles: {joint_angles}")

        # Store the sensor reading as data using the data_client
        sensor_value = self.robot_interface.get_force_sensor_value()
        try:
            # You can also use the data_client to store results
            datapoint = self.data_client.add_datapoint(
                data_value=sensor_value,
                data_name="force_sensor_reading",
                description="Force sensor reading after robot movement",
            )
            self.event_client.info(
                f"Stored force sensor reading as datapoint: {datapoint.datapoint_id}"
            )
        except Exception as e:
            self.event_client.warning(f"Could not store datapoint: {e}")

        return sensor_value


# See MADSci/example_lab/modules/advanced_example_node.py for more complex examples including returning files and


example_node = RobotNodeWithAction(node_definition=node_definition)
example_node.start_node(testing=True)

with node_client(example_node, RestNodeClient(url="http://localhost:2000")) as client:
    while not client.get_status().ready:
        # Wait for the node to be ready
        time.sleep(0.1)
    request = ActionRequest(
        action_name="move_joints",
        args={"joint_angles": [0.5, 0.5, 0.5, 0.5]},
    )
    # Send the action request to the node
    print(client.send_action(request))
    time.sleep(2)
    print(client.get_state())

Client Integration in Actions¶

Notice how in the action above, we use the built-in clients:

self.event_client.info() for logging action progress
self.data_client.add_datapoint() for storing measurement results
self.event_client.error() and self.event_client.warning() for error handling

This demonstrates how nodes can seamlessly integrate with the broader MADSci ecosystem to:

Log all activities for monitoring and debugging
Store experimental data for later analysis
Track resource usage (if needed)
Coordinate with the overall experiment management system

# What if we forget an argument?
with node_client(example_node, RestNodeClient(url="http://localhost:2000")) as client:
    while not client.get_status().ready:
        # Wait for the node to be ready
        time.sleep(0.1)
    request = ActionRequest(
        action_name="move_joints", args={}  # "joint_angles" is deliberately omitted
    )
    # Send the action request to the node
    print(client.send_action(request))

# What if the action fails?
with node_client(example_node, RestNodeClient(url="http://localhost:2000")) as client:
    while not client.get_status().ready:
        # Wait for the node to be ready
        time.sleep(0.1)
    request = ActionRequest(
        action_name="move_joints",
        args={"joint_angles": [0.5, 0.5, 0.5]},
    )
    # Send the action request to the node
    print(client.send_action(request))

# What does the node info look like?
with node_client(example_node, RestNodeClient(url="http://localhost:2000")) as client:
    while not client.get_status().ready:
        # Wait for the node to be ready
        time.sleep(0.1)
    # Example of a request to the node
    print(client.get_info())

Admin Commands¶

Admin commands provide a standardized interface for common administrative tasks used to control nodes. Unlike actions, admin commands have pre-defined behavior, and the node developer chooses whether and how to implement a specific admin command for a given device in the node. The MADSci Node Standard supports the following admin commands:

pause: pause the current action, if any
resume: resume the current action, if any
lock: refuse to accept new actions
unlock: resume accepting new actions
cancel: cancel the current action
reset: clear errors, reinitialize the node, and reset the status
stop: a more aggressive cancellation--halt immediately and trigger any safety/e-stop of the device
shutdown: disconnect from the device and stop the node’s server
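
As a rough, library-free illustration of how a node might honor a subset of these commands, the sketch below implements only lock and unlock and reports every other command as unsupported. `SketchLockableNode`, its method names, and the `"rejected"`/`"succeeded"` strings are hypothetical and do not reflect the MADSci implementation.

```python
class SketchLockableNode:
    """Hypothetical node that honors lock/unlock only."""

    def __init__(self) -> None:
        self.locked = False
        self.completed = []

    def admin(self, command: str) -> bool:
        """Handle an admin command; return False if it is not implemented here."""
        if command == "lock":
            self.locked = True
        elif command == "unlock":
            self.locked = False
        else:
            return False
        return True

    def run_action(self, name: str) -> str:
        # A locked node refuses new actions instead of queueing them.
        if self.locked:
            return "rejected"
        self.completed.append(name)
        return "succeeded"


node = SketchLockableNode()
node.admin("lock")
while_locked = node.run_action("move_joints")
node.admin("unlock")
after_unlock = node.run_action("move_joints")
pause_supported = node.admin("pause")
print(while_locked, after_unlock, pause_supported)
```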

# * Admin Command Example: Locking, Unlocking, and Resetting the Node
with node_client(example_node, RestNodeClient(url="http://localhost:2000")) as client:
    while not client.get_status().ready:
        # Wait for the node to be ready
        time.sleep(0.1)
    print(client.send_admin_command("lock"))
    request = ActionRequest(
        action_name="move_joints",
        args={"joint_angles": [0.5, 0.5, 0.5, 0.5]},
    )
    # Send the action request to the node
    print(client.send_action(request))
    print(client.send_admin_command("unlock"))
    # Send the action request to the node again
    print(client.send_action(request))

    # * Admin Command Example: Resetting the Node
    print(client.send_admin_command("reset"))
    while not client.get_status().ready:
        # Wait for the node to be ready
        time.sleep(0.1)
    time.sleep(3)
    print(client.get_state())