Diagnose and Profile a Pipeline
| Field | Value |
|---|---|
| Difficulty | Intermediate |
| Estimated Read Time | <10 minutes |
| Labels | diagnostics, debugging, observability |
When a pipeline misbehaves, the temptation is to jump straight into element-level debugging. This chapter teaches the cheaper first move: a repeatable triage pass that answers three questions in order — Is the graph contract valid? Does one run succeed? What do the runtime diagnostics say? It catches most misconfiguration in seconds, before it becomes a multi-hour session, and it works on the same minimal Input → Output graph you already know from chapter 004.
By the end you will have validated a graph's contract, run a single measured frame, and printed the measurement report that tells you whether the pipeline is healthy.
Walkthrough
Validate the contract
validate() is a contract-level check that runs before build(). It exercises the node order, caps, and backend parse path without streaming any data, and returns a report carrying a canonical error_code. An empty/ok code means the graph is structurally sound; anything else buckets the failure (see the error taxonomy below) so you know where to look. Running this first means you never waste time debugging runtime behavior on a graph that was never going to build.
// validate() checks the graph contract before build(); print its canonical error code.
auto report = graph.validate();
std::cout << "validate.error_code=" << report.error_code << "\n";
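The "empty/ok means sound" rule can be captured as a small gate in front of build(). This is a minimal sketch: contract_is_sound is a hypothetical helper name, and both the plain-string type of error_code and the literal spelling "ok" are assumptions rather than confirmed API.

```cpp
#include <string>

// Hypothetical helper (not part of NEAT): returns true when a validate()
// error code signals a structurally sound graph.
// Assumptions: the code is a plain string, and a non-empty success value,
// if one is used at all, is spelled "ok".
inline bool contract_is_sound(const std::string& error_code) {
  return error_code.empty() || error_code == "ok";
}
```

Assuming error_code converts to std::string, you could call contract_is_sound(report.error_code) and skip build() whenever it returns false.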
Run one measured frame
Next, build and run a single deterministic frame inside a start_measurement() window. output_memory = Owned asks for owned output buffers so the result stays valid after the call. One frame is enough: if it succeeds, the pipeline is live; if it throws, the exception carries a structured report you can bucket the same way as validate().
// Build a reusable runner and measure the caller-owned workload.
simaai::neat::RunOptions run_opt;
run_opt.output_memory = simaai::neat::OutputMemory::Owned;
auto run = graph.build(std::vector<cv::Mat>{rgb}, run_opt);
simaai::neat::MeasureOptions measure_opt;
measure_opt.title = "tutorial 012 diagnosis";
auto scope = run.start_measurement(measure_opt);
simaai::neat::TensorList out = run.run(std::vector<cv::Mat>{rgb}, /*timeout_ms=*/1000);
if (out.empty())
throw std::runtime_error("missing output tensor");
const simaai::neat::MeasureReport measured = scope.stop();
Read the runtime diagnostics
With one run on record, the MeasureReport summarizes the pipeline's health: counters (inputs_enqueued, outputs_pulled, drops), end-to-end latency, node metrics, plugin/kernel timing, edge timing, and optional power. Capture MeasureReport::to_text() as your baseline before escalating to the probes and DOT graphs described in In Practice.
// Post-run diagnostics come from the measurement report.
std::cout << "measure.inputs_enqueued=" << measured.counters.inputs_enqueued
<< " outputs_pulled=" << measured.counters.outputs_pulled << "\n";
std::cout << "measure.text_size=" << measured.to_text().size() << "\n";
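One way to turn those counters into a quick verdict is sketched below. The Counters struct is a hypothetical stand-in that only mirrors the three counter names mentioned in the text; the real field types may differ. The classification assumes a pipeline that yields one output per input, like the Input → Output graph in this chapter, and the verdict labels are illustrative.

```cpp
#include <cstdint>
#include <string>

// Hypothetical mirror of the counters named in the text. The real
// MeasureReport counters type and its field types may differ.
struct Counters {
  std::uint64_t inputs_enqueued;
  std::uint64_t outputs_pulled;
  std::uint64_t drops;
};

// Classify one measured window, assuming one output per input:
//   "idle"    - nothing was pushed into the pipeline
//   "stalled" - inputs went in but no output was pulled
//   "lossy"   - drops were counted or fewer outputs than inputs arrived
//   "healthy" - every input produced an output with no drops
inline std::string classify(const Counters& c) {
  if (c.inputs_enqueued == 0) return "idle";
  if (c.outputs_pulled == 0) return "stalled";
  if (c.drops > 0 || c.outputs_pulled < c.inputs_enqueued) return "lossy";
  return "healthy";
}
```

For the single-frame run above, inputs_enqueued=1 and outputs_pulled=1 with no drops would classify as "healthy" under these assumptions.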
Run
Run it and you should see the validate code and the measurement summary printed to stdout. Run the Python and C++ (prebuilt) commands from the Neat install root (the directory that contains share/ and lib/); run the build-from-source commands from the repo root. This chapter needs no model archive.
C++ (prebuilt):
./lib/sima-neat/tutorials/tutorial_012_diagnose_a_pipeline
C++ (build from source):
./build.sh --target tutorial_012_diagnose_a_pipeline
./build/tutorials-standalone/tutorial_012_diagnose_a_pipeline
Expected output (counter values and the summary string vary by run):
validate.error_code=
measure.inputs_enqueued=1 outputs_pulled=1
measure.text_size=...
[OK] 012_diagnose_a_pipeline
(The Python build prints validate_error_code=, inputs_enqueued=... outputs_pulled=..., and measure_text_size=....) To integrate this chapter's C++ source into your own project with a custom CMakeLists.txt (no extras folder required), see How to Run Tutorials on the landing page.
In Practice
This section covers the structured diagnostics, error taxonomy, debug knobs, and plugin-failure workflow you reach for when validate(), start_measurement(), or MeasureReport points at a problem.
GraphReport
GraphReport captures structured diagnostics:
- pipeline string (for reproduction)
- canonical error_code (machine triage)
- repro_note (human summary + hint)
- node reports and owned element names
- bus messages and error details
- optional flow/timing counters
When an error occurs, NeatError carries a GraphReport you can log or serialize.
Error taxonomy
Framework errors use stable code families:
| Error code | Meaning | Typical fix |
|---|---|---|
| misconfig.pipeline_shape | Node order/shape contract violation | Ensure Input() first for push pipelines and Output() last for pull pipelines |
| misconfig.caps | Caps negotiation/override mismatch | Align caps_override, format, and downstream caps |
| misconfig.input_shape | Input tensor/frame/sample shape/layout mismatch | Validate width/height/depth, layout, dtype, storage |
| build.parse_launch | gst_parse_launch failed | Validate fragment syntax and plugin availability |
| runtime.pull | Runtime pull/timeout/closed-output failure | Check sink output production, queue pressure, and upstream errors |
| io.parse | Saved-graph JSON parse/schema failure | Validate JSON and required node fields |
| io.open | Graph save/load file open/read/write failure | Check path existence, permissions, and storage health |
PullError.code uses the same taxonomy (not only exception paths).
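Because every code in the table has the form family.detail, bucketing can start from the prefix before the first dot. The sketch below assumes codes are plain strings; error_family and first_look are hypothetical helper names, and the hint strings paraphrase the "Typical fix" column rather than quote any framework output.

```cpp
#include <string>

// Return the family prefix of a dotted error code, e.g. "misconfig" for
// "misconfig.caps". A code without a dot is returned unchanged.
inline std::string error_family(const std::string& code) {
  const std::string::size_type dot = code.find('.');
  return dot == std::string::npos ? code : code.substr(0, dot);
}

// Hypothetical first-look hint per family, paraphrasing the taxonomy table.
inline std::string first_look(const std::string& code) {
  const std::string family = error_family(code);
  if (family.empty()) return "no error reported";
  if (family == "misconfig") return "check node order, caps, and input shape";
  if (family == "build") return "check fragment syntax and plugin availability";
  if (family == "runtime") return "check sink output, queue pressure, and upstream errors";
  if (family == "io") return "check the saved-graph file path and its JSON";
  return "unknown family: capture the full report";
}
```

Routing on the family first keeps the triage code short; the full code is still available when you need to tell misconfig.caps from misconfig.input_shape.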
Programmatic handling
#include "pipeline/ErrorCodes.h"
#include "pipeline/NeatError.h"
try {
auto run = graph.build(input);
simaai::neat::Sample out;
simaai::neat::PullError perr;
const auto st = run.pull(500, out, &perr);
if (st == simaai::neat::PullStatus::Error &&
perr.code == simaai::neat::error_codes::kRuntimePull) {
// runtime pull triage path
}
} catch (const simaai::neat::NeatError& e) {
if (e.report().error_code == simaai::neat::error_codes::kParseLaunch) {
// build/parse-launch triage path
}
}
Debug knobs (environment)
Key environment variables (see Architecture for detail):
- SIMA_GST_DOT_DIR: write DOT graphs for failures
- SIMA_GST_BOUNDARY_PROBES: boundary flow counters
- SIMA_GST_ELEMENT_TIMINGS: per-element timings
- SIMA_GST_FLOW_DEBUG: per-element flow counters
- SIMA_GST_ENFORCE_NAMES: enforce naming contract
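If you prefer to switch the knobs on from inside a test harness rather than the shell, a sketch along these lines may work. It uses POSIX setenv. The helper name enable_debug_knobs is hypothetical, the value "1" for the on/off knobs is an assumption (check Architecture for the accepted values), and it presumes the variables are read when the graph is built, so call it before build().

```cpp
#include <stdlib.h>  // setenv (POSIX)
#include <string>

// Hypothetical helper (not part of NEAT): set a few debug knobs for the
// current process. Returns true only if every setenv call succeeded.
// Assumption: "1" enables the boolean-style knobs.
inline bool enable_debug_knobs(const std::string& dot_dir) {
  bool ok = setenv("SIMA_GST_DOT_DIR", dot_dir.c_str(), /*overwrite=*/1) == 0;
  ok = (setenv("SIMA_GST_BOUNDARY_PROBES", "1", /*overwrite=*/1) == 0) && ok;
  ok = (setenv("SIMA_GST_ELEMENT_TIMINGS", "1", /*overwrite=*/1) == 0) && ok;
  return ok;
}
```

Setting the variables in the launching shell has the same effect and leaves the program untouched, which is usually the simpler choice for a one-off investigation.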
Debug workflow
- Capture GraphReport.error_code and bucket the failure by taxonomy first.
- Capture GraphReport.repro_note for concrete context and built-in hint.
- Capture pipeline text: Graph::describe_backend() or last_pipeline().
- Capture structured diagnostics: MeasureReport::to_text() or NeatError::report().
- Inspect GraphReport.bus for first terminal ERROR source + detail.
- If runtime stalls/timeouts, enable boundary/element probes to localize flow stop.
Recommended support bundle:
- error_code
- repro_note
- full pipeline_string
- first 3-5 terminal bus errors (GraphReport.bus)
- environment overrides used in run/validate
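The bundle above is easy to assemble into one block of text for a ticket. The following sketch is illustrative only: SupportBundle and format_bundle are hypothetical names, the fields are plain strings you fill in yourself from the report, and the output layout is not a NEAT format.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the support bundle listed above.
// It is not a NEAT type; copy the values out of your own report.
struct SupportBundle {
  std::string error_code;
  std::string repro_note;
  std::string pipeline_string;
  std::vector<std::string> bus_errors;     // terminal bus errors, oldest first
  std::vector<std::string> env_overrides;  // "NAME=value" pairs used in the run
};

// Render the bundle as text, keeping at most max_bus_errors bus entries
// (the list above recommends the first 3-5).
inline std::string format_bundle(const SupportBundle& b,
                                 std::size_t max_bus_errors = 5) {
  std::ostringstream os;
  os << "error_code: " << b.error_code << "\n"
     << "repro_note: " << b.repro_note << "\n"
     << "pipeline_string: " << b.pipeline_string << "\n";
  for (std::size_t i = 0; i < b.bus_errors.size() && i < max_bus_errors; ++i)
    os << "bus_error[" << i << "]: " << b.bus_errors[i] << "\n";
  for (const std::string& kv : b.env_overrides)
    os << "env: " << kv << "\n";
  return os.str();
}
```

Keeping the bus errors in arrival order matters because the first terminal error is usually the root cause and later ones are often fallout.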
Common failures → fixes
| Symptom | Likely cause | Fix |
|---|---|---|
| missing ... plugin | GStreamer plugin not found | Check GST_PLUGIN_PATH, run gst-inspect-1.0 <plugin> |
| appsink 'mysink' not found | Missing terminal Output() | Ensure Output is the last node in run/build pipelines |
| caps_override is set; renegotiation disabled | caps pinned | Remove caps_override or keep input caps fixed |
| tensor caps change not supported | Tensor shape/dtype change at runtime | Keep tensor shape/dtype stable (no renegotiation) |
Debugging plugin failures
When a plugin fails, NEAT raises a NeatError whose message contains the GStreamer error and a structured debug string. Use the fields to locate the root cause quickly.
- Read the structured fields. Look for the debug key/value fields in the error text:
  - node: the failing element name in the pipeline
  - config_path: JSON config file (if applicable)
  - model_path: model/pack path (if applicable)
  - hint: actionable fix guidance
  - detail: extra context such as missing keys or allocator state

  See the Error Format Reference for the full list.
- Confirm the pipeline context. Use the pipeline string from Graph::last_pipeline() or from the error report:
  - Verify the node name appears in the pipeline.
  - Confirm the config_path exists and is readable.
  - For caps errors, check upstream elements that negotiate into the failing node.
- Apply common fixes.
  - Config errors: verify JSON syntax, required keys, and any model paths.
  - Caps errors: add or fix parser elements (e.g., h264parse), ensure caps include required fields like parsed=true, stream-format=byte-stream, alignment=au.
  - Allocator errors: ensure upstream elements use the required allocator type (system vs. simaai memory/segment).
- Capture more diagnostics with the debug knobs above (SIMA_GST_DOT_DIR, SIMA_GST_FLOW_DEBUG, SIMA_GST_ELEMENT_TIMINGS).
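The "exists and is readable" check on config_path from the second step can be done with the standard library alone. The helper name below is hypothetical. It also rejects empty files, on the reasoning that an empty file cannot be a valid JSON config; that extra condition is a choice made for this sketch, not something the framework requires.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not part of NEAT): true if the file at `path` opens
// for reading and contains at least one byte. Peeking one byte also rules
// out paths that open but cannot be read, such as directories on Linux.
inline bool is_readable_nonempty_file(const std::string& path) {
  std::ifstream in(path.c_str(), std::ios::binary);
  return in.is_open() && in.peek() != std::ifstream::traits_type::eof();
}
```

Run the check under the same user and working directory as the failing pipeline, since a relative config_path or a permissions difference can make the file readable in one context and not the other.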
Full source
// Two diagnostic commands: Graph::validate and Run::start_measurement.
//
// Usage:
// tutorial_012_diagnose_a_pipeline
#include "neat.h"
#include <opencv2/core.hpp>
#include <iostream>
#include <stdexcept>
#include <vector>
int main() {
try {
cv::Mat rgb(96, 128, CV_8UC3, cv::Scalar(22, 44, 66));
if (!rgb.isContinuous())
rgb = rgb.clone();
simaai::neat::Graph graph;
simaai::neat::InputOptions in;
in.format = "RGB";
in.width = rgb.cols;
in.height = rgb.rows;
in.depth = rgb.channels();
graph.add(simaai::neat::nodes::Input(in));
graph.add(simaai::neat::nodes::Output());
// CORE LOGIC
// validate() checks the graph contract before build(); print its canonical error code.
auto report = graph.validate();
std::cout << "validate.error_code=" << report.error_code << "\n";
// Build a reusable runner and measure the caller-owned workload.
simaai::neat::RunOptions run_opt;
run_opt.output_memory = simaai::neat::OutputMemory::Owned;
auto run = graph.build(std::vector<cv::Mat>{rgb}, run_opt);
simaai::neat::MeasureOptions measure_opt;
measure_opt.title = "tutorial 012 diagnosis";
auto scope = run.start_measurement(measure_opt);
simaai::neat::TensorList out = run.run(std::vector<cv::Mat>{rgb}, /*timeout_ms=*/1000);
if (out.empty())
throw std::runtime_error("missing output tensor");
const simaai::neat::MeasureReport measured = scope.stop();
// Post-run diagnostics come from the measurement report.
std::cout << "measure.inputs_enqueued=" << measured.counters.inputs_enqueued
<< " outputs_pulled=" << measured.counters.outputs_pulled << "\n";
std::cout << "measure.text_size=" << measured.to_text().size() << "\n";
std::cout << "[OK] 012_diagnose_a_pipeline\n";
return 0;
} catch (const std::exception& e) {
std::cerr << "[FAIL] " << e.what() << "\n";
return 1;
}
}