
Beginner Tutorials

Use these tutorials in order. Each card links to a chapter with concept-first guidance and matching C++ and Python implementations.

Load a compiled ResNet-50 archive, feed it an image, and read the top-1 class — the shortest path from "I have a model archiv...

model, inference, foundations

Feed a model from a producer thread while consuming predictions from another, decoupling input and output for real throughput...

async, push-pull, throughput, runtime
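The push-pull idea in this card can be sketched with the Python standard library alone. This is a minimal illustration, not the runtime's API: `fake_infer`, `run_push_pull`, and the queue sizes are hypothetical stand-ins, and a real runtime would own the worker stage itself.

```python
import queue
import threading


def fake_infer(x):
    # Hypothetical stand-in for a model call; not part of any real runtime.
    return x * 2


def run_push_pull(inputs):
    """Push inputs from one thread and pull predictions from another."""
    pending = queue.Queue(maxsize=4)   # bounded, so the producer cannot run far ahead
    finished = queue.Queue()
    stop = object()                    # sentinel that marks end of stream
    predictions = []

    def producer():
        for item in inputs:
            pending.put(item)          # blocks while the queue is full
        pending.put(stop)

    def worker():
        # Plays the role of the runtime: takes inputs, emits predictions.
        while True:
            item = pending.get()
            if item is stop:
                finished.put(stop)
                break
            finished.put(fake_infer(item))

    def consumer():
        while True:
            result = finished.get()
            if result is stop:
                break
            predictions.append(result)

    threads = [threading.Thread(target=fn) for fn in (producer, worker, consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return predictions
```

Because a single worker drains a FIFO queue, predictions come back in input order. With several workers, ordering would need explicit handling.
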
Benchmark Your Model (5-10 minutes)

Run a compiled model with deterministic synthetic tensors and print the headline latency, throughput, power, and energy numbe...

benchmark, synthetic, latency, throughput, power
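As a rough sketch of what a synthetic benchmark measures, the snippet below times a stand-in function on a seeded, reproducible input. The names `synthetic_tensor`, `fake_model`, and `benchmark` are assumptions made for illustration and are not the tutorial's API. Power and energy are left out because they need hardware counters that plain Python does not expose.

```python
import random
import statistics
import time


def synthetic_tensor(size, seed=0):
    # A fixed seed makes the input identical on every run.
    rng = random.Random(seed)
    return [rng.random() for _ in range(size)]


def fake_model(tensor):
    # Hypothetical stand-in for a compiled model.
    return sum(v * v for v in tensor)


def benchmark(model, tensor, iterations=50, warmup=5):
    # Warm-up runs are excluded from the reported numbers.
    for _ in range(warmup):
        model(tensor)
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        model(tensor)
        latencies.append(time.perf_counter() - start)
    total = sum(latencies)
    return {
        "mean_latency_ms": statistics.mean(latencies) * 1e3,
        "median_latency_ms": statistics.median(latencies) * 1e3,
        "throughput_per_s": iterations / total if total > 0 else float("inf"),
    }
```

Calling `benchmark(fake_model, synthetic_tensor(1024))` returns a dictionary of latency and throughput figures; the values depend on the machine.
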

Compose a `Graph` by hand — input node, output node, no model — and run one frame through it. See the pipeline primitives in...

graph, build, run, pipeline

`ModelOptions` is the one struct that declares the contract between your input data, the model's pipeline stages, and its out...

model-options, configuration, contracts
Run an LLM (10 minutes)

Load a GenAI model directory, send one simple prompt, add a system prompt, then grow the same pattern into chat history and s...

genai, llm, chat, history, streaming
Run a VLM (10-15 minutes)

Ask repeated questions about the same image without re-encoding the image for every request.

genai, vlm, image, cache, multimodal
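The caching idea behind this card can be illustrated in a few lines. This is a sketch under stated assumptions: `CachedImageSession` and `fake_encode` are hypothetical names, and the real chapter may cache at a different layer. The point is only that the expensive encode step runs once per distinct image, however many questions follow.

```python
import hashlib


def fake_encode(image_bytes):
    # Hypothetical stand-in for an expensive vision encoder.
    return len(image_bytes)


class CachedImageSession:
    """Reuse encoded image features across repeated questions."""

    def __init__(self, encode_fn):
        self._encode = encode_fn
        self._cache = {}
        self.encode_calls = 0  # counts how often the encoder actually ran

    def _features(self, image_bytes):
        # Key the cache on the image content, not on object identity.
        key = hashlib.sha256(image_bytes).hexdigest()
        if key not in self._cache:
            self.encode_calls += 1
            self._cache[key] = self._encode(image_bytes)
        return self._cache[key]

    def ask(self, image_bytes, question):
        features = self._features(image_bytes)
        # A real model would combine features and question; this only echoes them.
        return f"{question} -> features={features}"
```

Asking two questions about the same bytes leaves `encode_calls` at 1; a different image raises it to 2.
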