Async vs sync timing model
Graph and Run expose two ways to drive a pipeline:
- Synchronous (Graph::run() with no inputs): the framework opens the pipeline, runs it to completion, and returns. Control returns once the source has emitted EOS.
- Asynchronous (Run::push() / Run::pull()): the application owns the loop. Work happens whenever there are samples in the queue; control returns to the application between pushes and pulls.
Both modes use the same Nodes, the same plan, and the same hardware. The difference is who drives the clock.
Synchronous mode
sima::Graph graph;
graph.add(sima::nodes::groups::FileMp4H264In("input.mp4"));
graph.add(model.graph());
graph.add(sima::nodes::groups::Mp4FileOut("output.mp4"));
graph.run(); // blocks until EOS or error
This is the simplest mode. The pipeline is a self-contained job: it has its own source (a file) and sink (another file). There are no live inputs or outputs.
Use sync mode for:
- File-to-file conversion or batch inference.
- One-off validation runs.
- Reproducibility tests where you want an entire run as one transaction.
Asynchronous mode
sima::Graph graph;
graph.add(sima::nodes::Push("rgb"));
graph.add(model.graph());
graph.add(sima::nodes::Pull("detections"));
auto run = graph.build();
run.start();
while (have_more_inputs()) {
    run.push(make_sample(...));
    if (auto out = run.pull(0); out) {
        consume(*out);
    }
}
run.stop();
The Run is a long-lived runtime. Push samples in via the Push Node (input is InputRole::Push); pull results out via the Pull Node. The framework schedules work as samples arrive.
Use async mode for:
- Live video / RTSP / camera input.
- Stream processing where the application controls cadence.
- Mixing inference with non-framework code (sensor fusion, business logic).
Push timing
Run::push() returns once the sample has been enqueued at the input boundary. It does not wait for the sample to traverse the pipeline. If the input queue is full, the outcome depends on the configured OverflowPolicy, with any wait governed by RunOptions::push_timeout_ms.
OverflowPolicy::Block (the default) makes push wait until space is available; OverflowPolicy::DropIncoming drops the new sample; OverflowPolicy::KeepLatest evicts the oldest pending sample to make room for the new one. Pick the policy that matches your latency budget.
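The three policies can be illustrated with a toy bounded queue. This is a minimal, single-threaded sketch and not the framework's implementation: ToyInputQueue and PushResult are hypothetical names, and the enum below only mirrors the policy names from the text. Where a real queue would wait for a free slot under Block, the sketch simply reports that it would have to wait.

```cpp
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical stand-in: the enumerator names mirror the text, but this is
// not the framework's enum.
enum class OverflowPolicy { Block, DropIncoming, KeepLatest };

// What a single push attempt did to the queue (illustrative only).
enum class PushResult { Enqueued, DroppedIncoming, EvictedOldest, WouldBlock };

template <typename T>
class ToyInputQueue {
public:
    // capacity must be at least 1.
    ToyInputQueue(std::size_t capacity, OverflowPolicy policy)
        : capacity_(capacity), policy_(policy) {}

    PushResult push(T sample) {
        if (pending_.size() < capacity_) {
            pending_.push_back(std::move(sample));
            return PushResult::Enqueued;
        }
        switch (policy_) {
        case OverflowPolicy::DropIncoming:
            return PushResult::DroppedIncoming;  // the new sample is discarded
        case OverflowPolicy::KeepLatest:
            pending_.pop_front();                // the oldest pending sample is evicted
            pending_.push_back(std::move(sample));
            return PushResult::EvictedOldest;
        case OverflowPolicy::Block:
            break;
        }
        // Block: a real queue would wait here (up to a timeout) until a
        // consumer frees a slot. The sketch only reports that it would wait.
        return PushResult::WouldBlock;
    }

    // Consumer side: moves the oldest pending sample into `out`.
    bool pop(T& out) {
        if (pending_.empty()) return false;
        out = std::move(pending_.front());
        pending_.pop_front();
        return true;
    }

    std::size_t size() const { return pending_.size(); }

private:
    std::size_t capacity_;
    OverflowPolicy policy_;
    std::deque<T> pending_;
};
```

With a capacity of 2 and samples 1, 2, 3 pushed in order, DropIncoming leaves 1 and 2 pending, while KeepLatest leaves 2 and 3. The first favors samples already queued; the second favors freshness, which tends to suit live video.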
Pull timing
Run::pull() returns the next sample available at the output boundary. The signatures vary:
- pull(timeout_ms = -1): blocks indefinitely (the default), or for up to timeout_ms.
- pull(0): non-blocking; returns nullopt if no sample is ready.
- pull_or_throw(): like pull() but raises on timeout, for code paths that treat "no sample" as a failure.
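The relationship between the non-blocking, timed, and throwing forms can be sketched on top of any non-blocking pull. The helpers below are hypothetical and do not use the framework's API: try_pull stands for any callable returning std::optional<T> (in application code it might wrap run.pull(0)), and the polling loop is only one way to build a timed wait. The framework itself may block differently.

```cpp
#include <chrono>
#include <optional>
#include <stdexcept>
#include <thread>

// Polls a non-blocking pull until it yields a sample or `timeout` elapses.
// A timeout of zero degenerates to a single non-blocking attempt.
template <typename TryPull>
auto pull_with_deadline(TryPull&& try_pull,
                        std::chrono::milliseconds timeout,
                        std::chrono::milliseconds poll_interval = std::chrono::milliseconds(1))
    -> decltype(try_pull()) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        if (auto out = try_pull()) return out;  // a sample is ready
        if (std::chrono::steady_clock::now() >= deadline) return std::nullopt;
        std::this_thread::sleep_for(poll_interval);
    }
}

// Same wait, but "no sample" is reported as an exception instead of an
// empty optional, for callers that treat a timeout as a failure.
template <typename TryPull>
auto pull_or_throw(TryPull&& try_pull, std::chrono::milliseconds timeout) {
    auto out = pull_with_deadline(try_pull, timeout);
    if (!out) throw std::runtime_error("pull timed out: no sample ready");
    return *out;
}
```

The optional-returning form suits loops that interleave pulling with other work, because an empty result is an expected outcome there. The throwing form suits request/response code, where a missing output means something upstream went wrong.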
The framework does not guarantee FIFO ordering across multiple input streams unless you explicitly request it via RunPreset or per-stream queues; see Tutorial 010: feed a multi-input model.
Telemetry — what was the actual latency?
Use Run::start_measurement() around the workload you own. The returned MeasureReport is the
public timing surface and includes:
- End-to-end push-to-output latency and throughput.
- Per-node timing summaries when available.
- Plugin/kernel and edge timing when requested in MeasureOptions.
- Runtime counters and optional power telemetry.
This is useful when async mode shows unexpected back-pressure: the report tells you whether time is spent at graph boundaries, in nodes, in plugins, or in queues.
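For intuition about what an end-to-end push-to-output latency figure represents, the measurement can be approximated by hand in application code. The sketch below is not the MeasureReport API: LatencyTracker and LatencySummary are hypothetical names, and it assumes the application can tag each sample with an id that is still visible on the matching output.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

using Clock = std::chrono::steady_clock;

// Hypothetical summary type; not the framework's MeasureReport.
struct LatencySummary {
    std::size_t count;  // outputs matched to a push
    double mean_ms;     // mean push-to-output latency
    double max_ms;      // worst observed latency
};

class LatencyTracker {
public:
    // Call right before pushing the sample identified by `id`.
    void on_push(std::uint64_t id, Clock::time_point t = Clock::now()) {
        pending_[id] = t;
    }

    // Call when the output for `id` has been pulled. Unknown ids are ignored.
    void on_pull(std::uint64_t id, Clock::time_point t = Clock::now()) {
        auto it = pending_.find(id);
        if (it == pending_.end()) return;
        const double ms =
            std::chrono::duration<double, std::milli>(t - it->second).count();
        total_ms_ += ms;
        max_ms_ = std::max(max_ms_, ms);
        ++count_;
        pending_.erase(it);
    }

    LatencySummary summary() const {
        const double mean = count_ ? total_ms_ / static_cast<double>(count_) : 0.0;
        return LatencySummary{count_, mean, max_ms_};
    }

private:
    std::unordered_map<std::uint64_t, Clock::time_point> pending_;
    std::size_t count_ = 0;
    double total_ms_ = 0.0;
    double max_ms_ = 0.0;
};
```

A hand-rolled tracker like this only sees the graph boundaries. It cannot attribute time to individual nodes, plugins, or queues, which is the breakdown the built-in report provides.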
Related types
- Graph::run(): synchronous entry point.
- Graph::build(): async entry point (returns a Run).
- Run::push() / pull(): async drive methods.
- OverflowPolicy: back-pressure behavior.
- MeasureOptions / MeasureReport: measured telemetry.
Further reading
- "Runs and parallelism" — §0.13, §12, §48, §79 of the design deep dive.
- "Async dispatch loop" — internals (§57).