Skip to content

feat:rust native modules#1794

Open
aclauer wants to merge 15 commits intodevfrom
andrew/feat/rust-native-modules
Open

feat:rust native modules#1794
aclauer wants to merge 15 commits intodevfrom
andrew/feat/rust-native-modules

Conversation

@aclauer
Copy link
Copy Markdown
Collaborator

@aclauer aclauer commented Apr 16, 2026

Problem

We need support for Rust native modules to offload performance critical modules.

Closes DIM-XXX

Solution

See native/rust/examples for a Rust native ping pong example

Added NativeModule struct to allow easy publishing and subscribing to topics. NativeModules can use any transport that implements the Transport trait (currently only LCM).

Rust NativeModules also support configuration through stdin instead of cli args. Cli args are kept for backwards compatibility.

Breaking Changes

None

How to Test

For unit tests
cd dimos/native/rust
cargo test

For end to end test with two Rust modules talking over LCM
python3 rust_ping_pong.py

Contributor License Agreement

  • I have read and approved the CLA.

@aclauer aclauer changed the title Andrew/feat/rust native modules feat/rust native modules Apr 16, 2026
@aclauer aclauer changed the title feat/rust native modules feat:rust native modules Apr 16, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 16, 2026

Greptile Summary

This PR adds a Rust native module SDK (dimos-native-module) and updates the Python NativeModule coordinator to send topics and config via stdin JSON (in addition to existing CLI args). The architecture is clean — a Transport trait abstracts the wire protocol, NativeModule<T> manages typed routes and a background tokio task, and a thorough unit-test suite covers the config-parsing logic.

  • P1 — native_module.py line 181: config_dict if config_dict else None coerces an empty dict {} to null. Any Rust module whose Python config fields are all None will receive JSON null instead of {}, causing serde_json to fail deserializing a struct type — even one with all #[serde(default)] fields. The fix is to drop the conditional and always send config_dict.

Confidence Score: 4/5

Safe to merge after fixing the empty-dict-to-null coercion in native_module.py; remaining findings are P2 style/hardening suggestions.

One P1 logic bug: the config_dict if config_dict else None expression silently sends JSON null instead of {} for Rust modules with all-optional config fields, causing a runtime deserialization failure. The two P2 findings (blocking IO in async, missing recv-error backoff) are real but do not affect correctness for the current usage pattern. Overall structure, trait design, and test coverage are solid.

dimos/core/native_module.py — the config serialization logic at line 181 needs the falsy-dict guard removed.

Important Files Changed

Filename Overview
dimos/core/native_module.py Python coordinator that spawns native subprocesses and sends topics/config via stdin JSON; has a P1 logic bug where an empty config dict is coerced to null, which will break Rust modules with all-optional config fields.
native/rust/src/module.rs Core NativeModule struct with stdin config parsing, typed route dispatch, and a tokio background loop; blocking I/O in from_stdin and no backoff on recv errors are P2 concerns; unit-test coverage is good.
native/rust/src/transport.rs Clean trait abstraction over the message transport; RPITIT pattern is appropriate for async trait methods in Rust 2021.
native/rust/src/lcm.rs Thin wrapper delegating to dimos-lcm; correctly maps LCM message to (channel, data) tuple.
native/rust/src/lib.rs Public re-exports; re-exports LcmOptions so callers avoid a direct dimos-lcm dependency.
native/rust/examples/native_ping.rs Ping example; publishes Twist at 5 Hz and logs echoes; straightforward use of the NativeModule API.
native/rust/examples/native_pong.rs Pong example; receives Twist and echoes with config value embedded; correctly uses serde deny_unknown_fields.
native/rust/examples/rust_ping_pong.py Python orchestrator for the ping-pong end-to-end test; cleanly declares two NativeModule subclasses and wires them with autoconnect.
native/rust/Cargo.toml Declares dimos-lcm from a git branch; acceptable for a new crate, though a published crate or path dependency would be more stable long-term.

Sequence Diagram

sequenceDiagram
    participant Py as NativeModule (Python)
    participant Proc as Rust Process
    participant BG as Background Task (tokio)
    participant LCM as LCM Bus

    Py->>Proc: spawn(executable + CLI args)
    Py->>Proc: stdin: {"topics": {...}, "config": {...}}
    Proc->>Proc: from_stdin() — parse topics & config
    Proc->>Proc: module.input() / module.output() — register routes
    Proc->>BG: module.spawn() — start select! loop

    loop message loop
        LCM-->>BG: transport.recv() -> (channel, data)
        BG->>BG: match route.topic() -> try_dispatch(data)
        BG-->>Proc: mpsc channel delivers typed message
        Proc->>BG: Output::publish(msg) -> mpsc send
        BG->>LCM: transport.publish(topic, data)
    end

    Py->>Proc: SIGTERM (on stop())
    Proc-->>Py: exit
Loading

Comments Outside Diff (1)

  1. dimos/core/native_module.py, line 14-16 (link)

    P2 Module docstring still says "C/C++" only

    The module-level docstring describes NativeModule as "a thin Python Module subclass … for native (C/C++) executables". Now that Rust is a first-class supported language (with its own SDK and examples), the docstring should reflect that.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Reviews (1): Last reviewed commit: "Pre commit" | Re-trigger Greptile

Comment thread dimos/core/native_module.py Outdated
Comment on lines +181 to +182
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Empty dict coerced to null, breaking structs with all-optional fields

config_dict if config_dict else None converts an empty dict {} to None, which serialises to JSON null. Any Rust module whose Python-side config fields are all None (e.g. optional_flag: int | None = None) will produce an empty dict and cause the Rust from_stdin::<PongConfig>() call to fail with "failed to deserialize config" — because serde_json::from_value(Value::Null) cannot produce a struct type, even one with all #[serde(default)] fields.

The conditional should be removed so the dict (possibly {}) is always forwarded:

Suggested change
)
stdin_blob = json.dumps(
{"topics": topics, "config": config_dict}
).encode() + b"\n"

Comment thread native/rust/src/module.rs
Comment on lines +163 to +165
let mut line = String::new();
io::stdin().lock().read_line(&mut line)?;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Blocking I/O inside an async function

io::stdin().lock().read_line(&mut line)? is a synchronous, blocking call. Inside a tokio::main context it occupies a worker thread for the duration of the read. Although the pipe already has data when Python spawns the process (so the block is brief in practice), Tokio's documentation recommends wrapping blocking calls with spawn_blocking or using the async-aware tokio::io::AsyncBufReadExt alternative:

use tokio::io::{AsyncBufReadExt, BufReader};

let mut stdin = BufReader::new(tokio::io::stdin());
let mut line = String::new();
stdin.read_line(&mut line).await?;

Comment thread native/rust/src/module.rs
Comment on lines +228 to +248
loop {
tokio::select! {
result = transport.recv() => match result {
Ok((channel, data)) => {
for route in &routes {
if route.topic() == channel {
route.try_dispatch(&data);
}
}
}
Err(e) => eprintln!("dimos_module: recv error: {e}"),
},
Some((topic, data)) = publish_rx.recv() => {
if let Err(e) = transport.publish(&topic, &data).await {
eprintln!("dimos_module: publish error on {topic}: {e}");
}
}
}
}
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Busy-loop on persistent transport errors

When transport.recv() returns an Err the error is logged and the loop immediately retries with no delay. If the underlying LCM socket enters a persistent error state (e.g. network interface gone), this will saturate a CPU core with error prints. Consider adding a short backoff or a consecutive-error counter:

Err(e) => {
    eprintln!("dimos_module: recv error: {e}");
    tokio::time::sleep(tokio::time::Duration::from_millis(100)).await;
}

Comment thread dimos/core/native_module.py Outdated
stderr=subprocess.PIPE,
)
assert self._process.stdin is not None
self._process.stdin.write(stdin_blob)
Copy link
Copy Markdown
Contributor

@leshy leshy Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, clean, will just mess with existing native modules, can we have a toggle? maybe a class itself can have an attribute on nativemodule version, new version expects stdin blob, old one, args

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure i can add a field to the native module config to tell it if it should read from cli args or stdin for the configs. (default to cli args for now, but we should move any existing modules to stdin asap, should be a pretty minor change)

@@ -0,0 +1,40 @@
// NativeModule ping example.
//
Copy link
Copy Markdown
Contributor

@leshy leshy Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this to dimos/examples/ dir? so we that don't create more example dirs

slight issue - dimos/examples/language-interop has examples of totally external scripts talking to dimos, this is doing actual native modules (I assume proper way going forward), so should reformat the examples/ dir somehow to make sense

direct dimos vs native-module dimos interop examples? idk, you can decide. we'll point new devs / users to whatever you write here - so in your interest to make it nice otherwise they'll talk to you :)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I'll move it over. My vision is to standardize the native module sdk across all the languages to match the Rust implementation, and then examples/native-modules can have an example module of each one and show them being coordinated by dimos.

Copy link
Copy Markdown
Contributor

@leshy leshy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small tweaks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants