DDS in Robotics Architecture: Real-Time Communication

Data Distribution Service (DDS) has become the dominant publish-subscribe middleware standard underpinning real-time communication in modern robotics systems, particularly since its adoption as the default transport layer in ROS 2. This page describes the DDS specification, its operational mechanics, the deployment scenarios where it applies, and the architectural decision boundaries that determine when DDS is appropriate versus when alternatives serve better. Professionals designing robotics architecture across industrial, mobile, and autonomous domains encounter DDS as a foundational infrastructure choice with significant consequences for latency, scalability, and safety certification.

Definition and scope

DDS is a data-centric publish-subscribe (DCPS) standard defined by the Object Management Group (OMG) under specification DDS 1.4. The standard defines a distributed middleware layer that allows software components — called participants — to exchange typed data through a shared logical data space called the Global Data Space. Unlike message-passing systems that route through a central broker, DDS operates broker-less: publishers and subscribers discover each other through a built-in discovery protocol and communicate directly over the network.

The OMG DDS specification has historically described two layers:

  1. Data-Centric Publish-Subscribe (DCPS) — the primary API layer where applications declare topics, data types, publishers, and subscribers.
  2. Data Local Reconstruction Layer (DLRL) — an optional object-oriented overlay that maps the data space to application objects. As of version 1.4, OMG publishes DLRL as a separate specification, so DCPS is the layer that robotics implementations actually provide.

Scope within robotics extends from intra-process communication on a single embedded board to multi-node communication across a fleet of robots sharing a network segment. ROS 2 uses DDS as its default middleware layer and reaches it through the RMW (ROS Middleware) abstraction interface. Implementations integrated through RMW include Eclipse Cyclone DDS, eProsima Fast DDS, and RTI Connext DDS.

How it works

DDS communication is organized around three structural primitives: Topics, DataWriters, and DataReaders. A Topic defines a named, typed data channel. A DataWriter publishes data to a Topic; a DataReader subscribes to it. Discovery between participants uses the Simple Discovery Protocol (SDP), which subdivides into the Simple Participant Discovery Protocol (SPDP) and the Simple Endpoint Discovery Protocol (SEDP), both specified in the OMG Real-Time Publish-Subscribe (RTPS) Wire Protocol standard, version 2.3.
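The matching step that follows endpoint discovery can be illustrated with a toy, in-process model. This is a minimal sketch rather than any DDS API: the `Endpoint` class, the `match_endpoints` function, and the participant and topic names are all hypothetical, and a real implementation also checks QoS compatibility before declaring a match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """A DataWriter or DataReader as it might be announced during discovery."""
    participant: str  # hypothetical participant name
    kind: str         # "writer" or "reader"
    topic: str        # topic name
    type_name: str    # name of the data type carried on the topic


def match_endpoints(endpoints):
    """Return (writer, reader) pairs that share a topic name and a type name."""
    writers = [e for e in endpoints if e.kind == "writer"]
    readers = [e for e in endpoints if e.kind == "reader"]
    return [
        (w, r)
        for w in writers
        for r in readers
        if w.topic == r.topic and w.type_name == r.type_name
    ]


# Hypothetical endpoints, used only for illustration.
endpoints = [
    Endpoint("lidar_driver", "writer", "scan", "LaserScan"),
    Endpoint("planner", "reader", "scan", "LaserScan"),
    Endpoint("logger", "reader", "scan", "LaserScan"),
    Endpoint("planner", "reader", "pose", "Pose"),
]

for writer, reader in match_endpoints(endpoints):
    print(f"{writer.participant} -> {reader.participant} on '{writer.topic}'")
```

The `pose` reader stays unmatched because no writer has announced that topic. In this model nothing routes through a central process: each pair is derived from the announcements alone.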

Quality of Service (QoS) policies are the primary mechanism through which DDS adapts to real-time constraints. The OMG DDS specification defines 22 distinct QoS policies. The most architecturally significant include:

  1. Reliability — BEST_EFFORT drops messages under congestion; RELIABLE guarantees delivery with retransmission.
  2. Durability — controls whether late-joining subscribers receive historical data (VOLATILE, TRANSIENT_LOCAL, TRANSIENT, PERSISTENT).
  3. Deadline — specifies the maximum period between successive data publications; violations trigger listener callbacks.
  4. Liveliness — defines how participants assert they remain active and how quickly failure is detected.
  5. History — governs whether only the last sample (KEEP_LAST) or all samples (KEEP_ALL) are retained in the middleware queue.
  6. Latency Budget — provides a hint to the transport about acceptable end-to-end delay, enabling transport-layer optimization.
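Several of these policies follow a requested-versus-offered rule: a DataReader matches a DataWriter only if the writer offers at least what the reader requests. The sketch below models that rule for Reliability, Durability, and Deadline only. The `Qos` dataclass and the `compatible` function are hypothetical names, not part of any DDS API, and the numeric enum values are an assumption chosen so that a stronger guarantee compares as greater.

```python
from dataclasses import dataclass
from enum import IntEnum


class Reliability(IntEnum):
    BEST_EFFORT = 0
    RELIABLE = 1


class Durability(IntEnum):
    VOLATILE = 0
    TRANSIENT_LOCAL = 1
    TRANSIENT = 2
    PERSISTENT = 3


@dataclass(frozen=True)
class Qos:
    """Hypothetical container for three of the policies listed above."""
    reliability: Reliability
    durability: Durability = Durability.VOLATILE
    deadline_s: float = float("inf")  # infinite means no deadline contract


def compatible(offered: Qos, requested: Qos) -> bool:
    """True if the writer's offered QoS satisfies the reader's requested QoS."""
    return (
        offered.reliability >= requested.reliability
        and offered.durability >= requested.durability
        # The writer must promise to publish at least as often as the reader expects.
        and offered.deadline_s <= requested.deadline_s
    )


best_effort_writer = Qos(Reliability.BEST_EFFORT)
reliable_reader = Qos(Reliability.RELIABLE)
print(compatible(best_effort_writer, reliable_reader))  # False: reader asks for more

state_writer = Qos(Reliability.RELIABLE, Durability.TRANSIENT_LOCAL, deadline_s=0.01)
state_reader = Qos(Reliability.RELIABLE, Durability.VOLATILE, deadline_s=0.02)
print(compatible(state_writer, state_reader))  # True: writer offers at least as much
```

An incompatible pair does not raise an error in the data path. The endpoints simply never match, which is why QoS mismatches are a common cause of silent non-delivery.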

Transport binding is configurable. DDS implementations typically support UDP/IP multicast for local discovery, UDP/IP unicast for point-to-point data delivery, and shared memory transport for intra-host communication. Shared memory paths in implementations like Eclipse Cyclone DDS bypass the network stack and reduce copying and serialization overhead, and can achieve sub-millisecond latency for intra-host delivery, which matters when integrating with a real-time operating system.

Common scenarios

DDS appears across robotics deployment contexts where decentralized, typed, and QoS-governed communication is required.

Industrial automation and collaborative robotics — Industrial robotics architectures use DDS to coordinate motion controllers, vision systems, and safety monitors on deterministic networks. The DEADLINE QoS policy enforces cycle-time contracts, and RELIABLE transport ensures that safety-critical state transitions are not dropped. This aligns with the communication integrity requirements described under IEC 61508 functional safety frameworks.
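The cycle-time contract behind the DEADLINE policy can be illustrated with an offline check over recorded publication times. This is a sketch under stated assumptions: `missed_deadlines` is a hypothetical helper, the timestamps are invented integer milliseconds, and a real DDS implementation reports a missed deadline from a timer when the period elapses instead of waiting for the next sample.

```python
def missed_deadlines(timestamps_ms, deadline_ms):
    """Return (previous, current) timestamp pairs whose gap exceeds the deadline."""
    return [
        (prev, cur)
        for prev, cur in zip(timestamps_ms, timestamps_ms[1:])
        if cur - prev > deadline_ms
    ]


# Hypothetical publication times for a 10 ms control cycle with a 12 ms deadline.
samples_ms = [0, 10, 20, 45, 55]
print(missed_deadlines(samples_ms, deadline_ms=12))  # [(20, 45)]
```

The single 25 ms gap is the only violation. In a live system that event would reach the application through the deadline-missed listener callback.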

Autonomous mobile robots and fleets — In multi-robot system architectures, DDS multicast discovery enables zero-configuration peer detection when all robots share a common subnet. A fleet of 20 autonomous mobile robots (AMRs) in a warehouse can form a DDS domain where each robot publishes pose and status topics that all peers receive without broker provisioning.
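The cost of this all-to-all pattern is easy to estimate. The sketch below counts participant pairs and writer-reader matches, assuming every robot subscribes to every peer's topics and that each robot publishes exactly two topics (pose and status). Both function names are hypothetical.

```python
def participant_pairs(robots):
    """Number of distinct participant pairs that must discover each other."""
    return robots * (robots - 1) // 2


def fleet_match_count(robots, topics_per_robot):
    """Writer-reader matches when every robot reads every peer's topics."""
    writers = robots * topics_per_robot
    readers_per_writer = robots - 1  # one reader on each peer robot
    return writers * readers_per_writer


print(participant_pairs(20))     # 190
print(fleet_match_count(20, 2))  # 760
```

Both figures grow quadratically with fleet size, which is one reason larger fleets are often partitioned rather than placed in a single flat domain.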

Surgical and safety-critical robotics — Surgical robotics architectures require deterministic latency between haptic feedback loops and actuator commands. DDS BEST_EFFORT with bounded queue depths provides predictable worst-case jitter profiles, and LIVELINESS policies detect component failure within a configurable lease duration, typically set between 100 milliseconds and 2 seconds in clinical-grade systems.
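The lease mechanism can be sketched as a monitor that records each liveliness assertion and flags any writer whose last assertion is older than the lease duration. `LivelinessMonitor` and the writer names are hypothetical, and time is passed in explicitly so the example is deterministic. A real implementation drives this from its own clock and notifies readers through a liveliness-changed callback.

```python
class LivelinessMonitor:
    """Toy model of lease-based liveliness detection."""

    def __init__(self, lease_duration_s):
        self.lease_duration_s = lease_duration_s
        self._last_assert_s = {}  # writer id -> time of last assertion

    def assert_liveliness(self, writer_id, now_s):
        """Record that a writer has signalled it is still alive."""
        self._last_assert_s[writer_id] = now_s

    def lost(self, now_s):
        """Return the writers whose lease has expired at time now_s."""
        return sorted(
            writer_id
            for writer_id, last_s in self._last_assert_s.items()
            if now_s - last_s > self.lease_duration_s
        )


monitor = LivelinessMonitor(lease_duration_s=0.5)
monitor.assert_liveliness("haptic_device", now_s=0.0)
monitor.assert_liveliness("actuator", now_s=0.0)
monitor.assert_liveliness("actuator", now_s=0.4)  # only the actuator renews
print(monitor.lost(now_s=0.6))  # ['haptic_device']
```

The lease duration sets the worst-case detection time, so a shorter lease detects failure sooner at the cost of more frequent assertions.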

Sensor fusion pipelines — In sensor fusion architectures, DDS topics carry timestamped point clouds, IMU readings, and camera frames from multiple producers to a centralized or distributed fusion node. KEEP_LAST history with depth 1 minimizes memory footprint on resource-constrained edge nodes.
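The memory behavior of KEEP_LAST can be modeled with a bounded queue. This is a simplified sketch: `KeepLastCache` is a hypothetical class, and it treats the topic as a single stream, whereas DDS applies history depth per keyed instance.

```python
from collections import deque


class KeepLastCache:
    """Toy reader-side cache that retains only the newest `depth` samples."""

    def __init__(self, depth=1):
        # A deque with maxlen silently discards the oldest entry when full.
        self._samples = deque(maxlen=depth)

    def write(self, sample):
        self._samples.append(sample)

    def take(self):
        """Return the retained samples, oldest first, and empty the cache."""
        samples = list(self._samples)
        self._samples.clear()
        return samples


cache = KeepLastCache(depth=1)
for frame_id in ("frame_1", "frame_2", "frame_3"):
    cache.write(frame_id)  # a slow consumer falls behind the producer

print(cache.take())  # ['frame_3']
```

With depth 1 the fusion node always sees the freshest sample and memory use stays constant, at the cost of discarding anything it was too slow to read.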

Decision boundaries

DDS is not universally optimal. The architectural decision between DDS and alternative transports involves measurable trade-offs.

DDS vs. ROS 1 TCPROS/UDPROS — ROS 1 relied on a central rosmaster node for name resolution. If the master process is lost, nodes can no longer discover new topics or peers, although connections that are already established keep running. DDS eliminates this single point of failure through distributed SPDP, which is a primary reason the ROS 2 architecture adopted it. However, DDS introduces higher baseline memory consumption: Fast DDS documentation reports a minimum footprint of approximately 1.5 MB per participant on Linux, which rules out microcontrollers with less memory than that.

DDS vs. MQTT or custom serial protocols — Embedded systems architectures on microcontrollers (ARM Cortex-M class, under 256 KB RAM) cannot host a full DDS stack. Micro XRCE-DDS, eProsima's implementation of the OMG DDS for eXtremely Resource Constrained Environments (DDS-XRCE) specification, bridges these devices to an agent running on a host processor. The agent participates in the DDS data space on the device's behalf, preserving QoS semantics across the boundary.

Domain segmentation — DDS participants are grouped by numeric Domain ID. Two participants on the same network with different Domain IDs cannot communicate directly. This provides logical isolation for safety architecture partitioning, where a safety-rated subsystem operates on a dedicated domain isolated from non-safety components.
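On UDP transports, this isolation follows from the way the RTPS wire protocol derives port numbers from the Domain ID. The sketch below applies the default port mapping constants from the RTPS specification. The function name `rtps_ports` is hypothetical, and implementations generally allow these defaults to be overridden.

```python
# Default RTPS port mapping constants.
PB = 7400  # port base
DG = 250   # domain ID gain
PG = 2     # participant ID gain
D0, D1, D2, D3 = 0, 10, 1, 11  # per-traffic-class offsets


def rtps_ports(domain_id, participant_id=0):
    """Return the default UDP ports for a participant in the given domain."""
    base = PB + DG * domain_id
    return {
        "discovery_multicast": base + D0,
        "discovery_unicast": base + D1 + PG * participant_id,
        "user_multicast": base + D2,
        "user_unicast": base + D3 + PG * participant_id,
    }


print(rtps_ports(domain_id=0))
# {'discovery_multicast': 7400, 'discovery_unicast': 7410,
#  'user_multicast': 7401, 'user_unicast': 7411}
print(rtps_ports(domain_id=1)["discovery_multicast"])  # 7650
```

Because participants in different domains use disjoint ports, they never receive each other's discovery announcements. This is logical separation only: it does not replace network-level or DDS Security controls.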

Security overhead — The OMG DDS Security specification (version 1.1) defines authentication, access control, and cryptographic protection plugins. Enabling full DDS Security with AES-128-GCM encryption on all topics imposes measurable CPU overhead — RTI documentation indicates encryption can add 15–30% processing overhead on constrained processors — a factor that must be weighed in cybersecurity-aware robotics architectures.

