This repository converts Arm Ethos-U Vela raw output into a small bare-metal C integration package:
- Vela raw output (*_vela.npz)
- generated driver-facing C headers and sources
- generated reference input/output arrays from the original .tflite
- optional .txt exports for test harnesses
It also includes example generated outputs, Ethos-U driver snapshots, and NeuralSPOT-oriented integration examples.
Given a .tflite model, the pipeline can generate:
- *_vela.npz: Vela raw output
- *_cmd_data.h: command stream payload for the driver
- *_weights.h: model weights blob
- *_meta.h: tensor region and size metadata
- *_buffers.h and *_buffers.c: scratch and tensor region allocations
- *_run.c: minimal direct-driver invocation example
- *_data.h: reference input/output arrays from TFLite inference
- src/*_input.txt, src/*_golden_output.txt, src/*_weights.txt, src/*_cmd_data.txt: one-value-per-line text dumps
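
The artifact list above can double as a checklist after a pipeline run. The sketch below is a hypothetical helper, not part of this repository. It assumes every file shares a single prefix, which may not hold when the model name contains characters such as dots (the mobilenet examples in this document use two spellings of the prefix).

```python
from pathlib import Path

# Suffixes copied from the artifact list; the helper names are hypothetical.
TOP_LEVEL_SUFFIXES = (
    "_vela.npz", "_cmd_data.h", "_weights.h", "_meta.h",
    "_buffers.h", "_buffers.c", "_run.c", "_data.h",
)
TEXT_DUMP_SUFFIXES = (
    "_input.txt", "_golden_output.txt", "_weights.txt", "_cmd_data.txt",
)


def expected_artifacts(output_dir, prefix):
    """Return the paths expected for one model, assuming one shared prefix."""
    root = Path(output_dir)
    paths = [root / f"{prefix}{suffix}" for suffix in TOP_LEVEL_SUFFIXES]
    # The text dumps live in a src/ subdirectory of the output directory.
    paths += [root / "src" / f"{prefix}{suffix}" for suffix in TEXT_DUMP_SUFFIXES]
    return paths


def missing_artifacts(output_dir, prefix):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_artifacts(output_dir, prefix) if not p.exists()]
```

Calling missing_artifacts("output/kws", "kws_ref_model") after a run would list whatever the pipeline did not produce, for instance because a --skip-* option was passed.
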
Current limitations:
- Only single-command-stream models are supported.
- Models that require CPU fallback are not supported by python/vela_raw_to_c.py.
- The helper inference script currently assumes a single input tensor and a single output tensor.
- The generated code is intended for direct Ethos-U driver integration, not TFLM.
- GCC-based embedded builds are the documented path in this repository.
- The examples here are compile-oriented; hardware validation is not documented in this repo.
Requirements:
- Python 3.11+
- Arm Vela CLI available as vela
- TensorFlow and NumPy for reference array generation
The repo already declares Python dependencies in pyproject.toml. A typical setup is:
uv sync
If you are not using uv, install the equivalent packages manually.
The main entrypoint is run_vela_pipeline.py. It runs:
- Vela with --output-format raw
- python/vela_raw_to_c.py
- python/generate_c_arrays.py
- python/array_2_txt.py
To run it on a model:
python3 run_vela_pipeline.py example_models/kws_ref_model/kws_ref_model.tflite
By default this writes output to:
example_models/kws_ref_model/kws_ref_model_output/
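
The default location shown above appears to follow the pattern <model directory>/<model name>_output/. A minimal sketch of that rule, inferred from this one example rather than taken from run_vela_pipeline.py (the function name is hypothetical):

```python
from pathlib import Path


def default_output_dir(model_path):
    """Guess the default output directory: <model dir>/<model stem>_output.

    Assumption: the rule is inferred from the single kws_ref_model example.
    """
    model = Path(model_path)
    # Path.stem drops only the final extension, so dots inside the model
    # name (as in mobilenet_v2_1.0_224_INT8) are preserved.
    return model.parent / f"{model.stem}_output"
```
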
To override the defaults, pass options explicitly:
python3 run_vela_pipeline.py example_models/kws_ref_model/kws_ref_model.tflite \
--output-dir output/kws \
--accelerator-config ethos-u85-256 \
--vela-config config/ambiq_final.ini \
--system-config AmbiqLP_SRAM \
--memory-mode Sram_Only_256KB
Options:
- --output-dir: output directory for generated artifacts
- --vela-config: Vela .ini file
- --accelerator-config: NPU target, for example ethos-u85-256
- --system-config: Vela system config name from the .ini
- --memory-mode: Vela memory mode name from the .ini
- --vela-prefix: prefix for generated direct-driver C files
- --raw-to-c-prefix: explicit override for the raw-to-C prefix
- --c-arrays-output: custom path for the generated *_data.h
- --skip-vela: reuse an existing *_vela.npz
- --skip-raw-to-c: skip direct-driver C generation
- --skip-c-arrays: skip reference input/output generation
- --skip-array-to-txt: skip .txt exports
- --clean: remove the output directory before running
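
When the pipeline is driven from another script, the options above can be assembled into an argument list rather than a shell string. The following is a sketch under assumptions: build_pipeline_command and its keyword names are hypothetical, and only flags listed in this document are emitted.

```python
import sys


def build_pipeline_command(model, output_dir=None, accelerator_config=None,
                           vela_config=None, system_config=None,
                           memory_mode=None, skip=(), clean=False):
    """Assemble an argv list for run_vela_pipeline.py (hypothetical helper)."""
    cmd = [sys.executable, "run_vela_pipeline.py", str(model)]
    optional = {
        "--output-dir": output_dir,
        "--accelerator-config": accelerator_config,
        "--vela-config": vela_config,
        "--system-config": system_config,
        "--memory-mode": memory_mode,
    }
    for flag, value in optional.items():
        if value is not None:
            cmd += [flag, str(value)]
    # Stage names map onto the documented --skip-* flags.
    allowed = {"vela", "raw-to-c", "c-arrays", "array-to-txt"}
    for stage in skip:
        if stage not in allowed:
            raise ValueError(f"unknown stage: {stage}")
        cmd.append(f"--skip-{stage}")
    if clean:
        cmd.append("--clean")
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, check=True). Passing a list avoids shell quoting problems with paths that contain spaces.
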
Example of running Vela directly:
vela \
--accelerator-config ethos-u85-256 \
example_models/mobilenet_v2_1.0_224_INT8/mobilenet_v2_1.0_224_INT8.tflite \
--output-format raw \
--config config/ambiq_final.ini \
--system-config AmbiqLP_SRAM \
--memory-mode Sram_Only_256KB \
--output-dir output/mobilenet
This produces a *_vela.npz file in the selected output directory.
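
Before converting the file, it can help to see which arrays it holds. A .npz file is a zip archive with one .npy member per array, so the names can be listed with the standard library alone. This is a sketch with a hypothetical function name; it makes no assumption about which array names Vela writes.

```python
import zipfile


def list_npz_arrays(npz_path):
    """Return the array names stored in a .npz file without loading the data."""
    with zipfile.ZipFile(npz_path) as archive:
        return sorted(
            name[:-len(".npy")]          # strip the per-array file extension
            for name in archive.namelist()
            if name.endswith(".npy")
        )
```

With NumPy installed, numpy.load(path).files gives the same names and also provides access to the array contents.
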
python/vela_raw_to_c.py turns the Vela .npz into direct-driver integration code:
python3 python/vela_raw_to_c.py \
output/mobilenet/mobilenet_v2_1.0_224_INT8_vela.npz \
--out-dir output/mobilenet \
--prefix mobilenet_v2_1_0_224_INT8
python/generate_c_arrays.py runs the original TFLite model with generated random input and emits a header containing:
- <model>_input
- <model>_output
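
To make the shape of such a header concrete, the sketch below generates reproducible random values and renders them as a C array. Everything here is an assumption for illustration: the int8_t element type, the const qualifier, the line layout, and both function names. The actual output of python/generate_c_arrays.py may differ.

```python
import random


def random_int8_input(count, seed=0):
    """Generate reproducible random int8 values for a reference input."""
    rng = random.Random(seed)
    return [rng.randint(-128, 127) for _ in range(count)]


def format_c_array(name, values, c_type="int8_t", per_line=12):
    """Render values as a C array definition (layout is an assumption)."""
    lines = [f"const {c_type} {name}[{len(values)}] = {{"]
    for start in range(0, len(values), per_line):
        chunk = values[start:start + per_line]
        lines.append("    " + ", ".join(str(v) for v in chunk) + ",")
    lines.append("};")
    return "\n".join(lines)
```

A fixed seed keeps the input and the recorded output paired across runs, which matters when the output is later used as a golden reference.
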
Example:
python3 python/generate_c_arrays.py \
example_models/mobilenet_v2_1.0_224_INT8/mobilenet_v2_1.0_224_INT8.tflite \
-o output/mobilenet/mobilenet_v2_1.0_224_INT8_data.h
python/array_2_txt.py extracts relevant arrays from generated headers and writes text files suitable for external harnesses.
Example:
python3 python/array_2_txt.py \
output/mobilenet/mobilenet_v2_1.0_224_INT8_data.h \
output/mobilenet/mobilenet_v2_1.0_224_INT8_cmd_data.h \
output/mobilenet/mobilenet_v2_1.0_224_INT8_weights.h \
-o output/mobilenet/src \
--prefix mobilenet_v2_1_0_224_INT8
This generates:
- mobilenet_v2_1_0_224_INT8_input.txt
- mobilenet_v2_1_0_224_INT8_golden_output.txt
- mobilenet_v2_1_0_224_INT8_weights.txt
- mobilenet_v2_1_0_224_INT8_cmd_data.txt
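
The header-to-text step can be pictured as: find an array initializer, pull out its integer literals, and write one per line. The sketch below illustrates that idea only and is not the logic of python/array_2_txt.py. The function names are hypothetical, and it assumes decimal or hexadecimal integer literals (octal literals would be read as decimal).

```python
import re


def extract_c_array(header_text, name):
    """Return the integer literals of the array called name, or None."""
    pattern = re.compile(
        r"\b" + re.escape(name) + r"\s*\[[^\]]*\]\s*=\s*\{(.*?)\}",
        re.DOTALL,
    )
    match = pattern.search(header_text)
    if match is None:
        return None
    # Drop C comments so numbers inside them are not picked up.
    body = re.sub(r"/\*.*?\*/|//[^\n]*", "", match.group(1), flags=re.DOTALL)
    tokens = re.findall(r"-?(?:0[xX][0-9a-fA-F]+|\d+)", body)
    return [int(t, 16) if "x" in t.lower() else int(t) for t in tokens]


def to_text_dump(values):
    """Format values as a one-value-per-line text dump."""
    return "".join(f"{v}\n" for v in values)
```
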
python/slice_tflite.py creates prefix slices of a TFLite model by operator count. This is useful for bring-up and debugging.
python3 python/slice_tflite.py \
example_models/resnet_v1_8_32_tfs_int8/resnet_v1_8_32_tfs_int8.tflite \
--step 5
This creates:
slice_1/<model>_1.tflite
slice_2/<model>_2.tflite
...
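
The naming scheme above can be sketched as a plan of slice paths and prefix lengths. Two points are assumptions that this document does not spell out: that slice k keeps the first k * step operators, and that a final shorter slice covers any remainder. The function name and the total_ops parameter are hypothetical.

```python
from pathlib import Path


def slice_plan(model_path, total_ops, step):
    """Return (relative path, operator count) pairs for prefix slices."""
    if step <= 0:
        raise ValueError("step must be positive")
    stem = Path(model_path).stem
    plan = []
    # Slice boundaries at step, 2*step, ... capped at the full model length.
    for index, end in enumerate(range(step, total_ops + step, step), start=1):
        count = min(end, total_ops)
        plan.append((Path(f"slice_{index}") / f"{stem}_{index}.tflite", count))
    return plan
```

For a model with 12 operators and a step of 5, this plan has three entries covering 5, 10, and 12 operators.
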
To also run the pipeline on each slice, add --run-pipeline:
python3 python/slice_tflite.py \
example_models/resnet_v1_8_32_tfs_int8/resnet_v1_8_32_tfs_int8.tflite \
--step 5 \
--run-pipeline
Each slice gets its own output/ directory under the slice folder.
The config/ directory contains sample Vela configuration files, including config/ambiq_final.ini, which the examples in this document use.
Use the file and the matching --system-config and --memory-mode names that correspond to your target.
This repository also contains:
- Ethos-U driver snapshots under ethos-u-core-driver-real and bobby_ethos-u-core-driver-real
- example integration projects under example/vela and example/vela_for_neuralspot
Those directories are useful as reference integration points; the Python pipeline itself does not depend on building them.
Repository layout:
config/                 Vela configuration files
example/                integration examples
example_models/         input models and sample generated outputs
performance/            performance notes/data
python/                 helper scripts
run_vela_pipeline.py    end-to-end pipeline runner