AkiraClaw is the on-device ML inference subsystem for AkiraOS. It exposes a three-function WASM API (aiinfer_load / aiinfer_run / aiinfer_unload) backed by TFLite Micro, gated behind the ai.infer capability in the app manifest.
Quick-start
1 — Declare the capability in manifest.json
{
"name": "my_ai_app",
"version": "1.0.0",
"capabilities": ["display.write", "sensor.read", "ai.infer", "memory"],
"memory_quota": 131072
}
2 — Call the API in main.c
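#include <stdint.h>   /* int8_t */
#include "akira_api.h"
/* Example sizes only: set these to your model's tensor byte counts. */
#define INPUT_SIZE  1960 /* e.g. 49 x 40 INT8 log-mel features */
#define OUTPUT_SIZE 2    /* e.g. 2-class softmax */
extern const unsigned char g_model[]; /* linked-in model bytes */
extern const unsigned int g_model_len;
int main(void)
{
/* Returns a slot handle >= 0, or a negative AIINFER_ERR_* code. */
int handle = aiinfer_load(g_model, (int)g_model_len);
if (handle < 0) { /* handle errors */ return handle; }
int8_t input[INPUT_SIZE];
int8_t output[OUTPUT_SIZE];
/* ... fill input ... */
/* Returns 0 on success, or a negative AIINFER_ERR_* code. */
int ret = aiinfer_run(handle, input, INPUT_SIZE, output, OUTPUT_SIZE);
aiinfer_unload(handle); /* always release the slot */
return ret;
}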
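Once aiinfer_run returns 0, a classifier's output buffer typically holds one INT8 score per class. The following is a minimal sketch of picking the top class under that assumption. argmax_int8 is a hypothetical helper for illustration and is not part of akira_api.h.

```c
#include <stdint.h>

/* Hypothetical helper (not part of akira_api.h): returns the index of the
 * largest INT8 score, or -1 if the buffer is empty or NULL.
 * Ties resolve to the lowest index. */
int argmax_int8(const int8_t *scores, int count)
{
    if (scores == 0 || count <= 0) {
        return -1;
    }
    int best = 0;
    for (int i = 1; i < count; i++) {
        if (scores[i] > scores[best]) {
            best = i;
        }
    }
    return best;
}
```

In the quick-start above, this would be called as argmax_int8(output, OUTPUT_SIZE) after checking that ret is 0.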
3 — Pack the model into the .akpkg
# Compile (assumes a clang whose default target is wasm32, e.g. wasi-sdk; otherwise add --target=wasm32)
clang -O2 -nostdlib -Wl,--no-entry -Wl,--export=main -Wl,--allow-undefined \
main.c -o my_ai_app.wasm
# Bundle model with app
akira-cli pack my_ai_app.wasm manifest.json --model model.tflite
# Sign
akira-cli sign my_ai_app.akpkg --key privkey.pem
API Reference
int aiinfer_load(const void *model, int model_size)
Loads a TFLite flatbuffer into an inference slot. The model is copied internally; the caller may free the source buffer after this call returns.
Returns: Non-negative slot handle on success (0–3 with the default limit of 4 slots).
| Error code | Value | Meaning |
|---|---|---|
| AIINFER_ERR_NOMEM | -1 | Arena or model copy OOM |
| AIINFER_ERR_INVALID | -2 | Bad pointer / schema mismatch |
| AIINFER_ERR_NOSLOT | -4 | All AKIRA_AIINFER_MAX_HANDLES slots occupied |
Kconfig: slot limit is CONFIG_AKIRA_AIINFER_MAX_HANDLES (default 4).
int aiinfer_run(int handle, const void *input, int input_size, void *output, int output_size)
Runs one inference pass. The function:
- Copies input_size bytes into the model’s input tensor.
- Calls MicroInterpreter::Invoke().
- Copies the output tensor bytes into output.
input_size must exactly equal the model’s input tensor byte count. output_size must be ≥ the model’s output tensor byte count.
Returns: 0 on success.
| Error code | Value | Meaning |
|---|---|---|
| AIINFER_ERR_INVALID | -2 | Bad handle or Invoke() failed |
| AIINFER_ERR_SHAPE | -3 | input/output size mismatch |
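The size rule for aiinfer_run can also be checked on the app side before the call. This is a sketch only: check_io_sizes is a hypothetical helper, the model's tensor byte counts are assumed to be known to the app (the three-function API described here does not expose them), and the -3 value is copied from the AIINFER_ERR_SHAPE row above.

```c
/* Mirrors AIINFER_ERR_SHAPE from the error table; redefined under a
 * sketch-local name so this example compiles on its own. */
#define SKETCH_ERR_SHAPE (-3)

/* Hypothetical helper: returns 0 if the buffers satisfy the aiinfer_run
 * size rule, SKETCH_ERR_SHAPE otherwise. */
int check_io_sizes(int input_size, int output_size,
                   int model_input_bytes, int model_output_bytes)
{
    if (input_size != model_input_bytes) {
        return SKETCH_ERR_SHAPE;   /* input must match exactly */
    }
    if (output_size < model_output_bytes) {
        return SKETCH_ERR_SHAPE;   /* output buffer too small */
    }
    return 0;                      /* a larger output buffer is allowed */
}
```

Note the asymmetry: an oversized output buffer passes, while an oversized input buffer does not.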
void aiinfer_unload(int handle)
Releases the slot, destroys the interpreter, and frees the model copy. The handle must not be used after this call.
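For logging, it can help to turn the return codes from the tables above into readable names. The sketch below assumes the numeric values listed in this section. aiinfer_err_name is a hypothetical helper, and the local defines exist only so the example stands alone; they are skipped if the AIINFER_ERR_* macros are already defined.

```c
/* Values copied from the error tables in this section. The guard assumes
 * the real header, if included, defines these as macros. */
#ifndef AIINFER_ERR_NOMEM
#define AIINFER_ERR_NOMEM   (-1)
#define AIINFER_ERR_INVALID (-2)
#define AIINFER_ERR_SHAPE   (-3)
#define AIINFER_ERR_NOSLOT  (-4)
#endif

/* Hypothetical helper: maps an aiinfer_* return value to a printable name.
 * Non-negative values (handles, or 0 from aiinfer_run) map to "OK". */
const char *aiinfer_err_name(int code)
{
    switch (code) {
    case AIINFER_ERR_NOMEM:   return "AIINFER_ERR_NOMEM";
    case AIINFER_ERR_INVALID: return "AIINFER_ERR_INVALID";
    case AIINFER_ERR_SHAPE:   return "AIINFER_ERR_SHAPE";
    case AIINFER_ERR_NOSLOT:  return "AIINFER_ERR_NOSLOT";
    default:                  return code >= 0 ? "OK" : "UNKNOWN";
    }
}
```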
Model Packaging
The .akpkg format is a gzip-compressed tar archive. When you pass --model model.tflite to akira-cli pack, the model is added as the model.tflite tar entry and its bytes are included in the signing digest:
SHA-256(manifest.json || app.wasm || model.tflite)
On installation, the firmware copies model.tflite to /lfs/apps/<app_name>/model.tflite on the device filesystem, from where the app loads it at runtime via storage_open.
Supported model formats
| Format | Notes |
|---|---|
| INT8-quantized TFLite flatbuffer | Recommended — smallest RAM, fastest on all targets |
| FLOAT32 TFLite flatbuffer | Works but uses more arena RAM and is slower |
Models must use operators registered in akira_aiinfer_api.cpp. The default build registers: DEPTHWISE_CONV_2D, CONV_2D, FULLY_CONNECTED, SOFTMAX, RESHAPE, AVERAGE_POOL_2D, MAX_POOL_2D, QUANTIZE, DEQUANTIZE. Add more via the resolver in the .cpp file if needed.
Target RAM & Latency Guide
Tensor arena size is set by CONFIG_AKIRA_AIINFER_ARENA_KB (default 128 KiB). Reduce it on RAM-constrained targets; increase it for larger models.
| Target | Accelerator | DS-CNN KWS latency (est.) | Min arena |
|---|---|---|---|
| ESP32-S3 + ESP-NN | ESP-NN SIMD (CONFIG_AKIRA_TFLITE_ESPNN=y) | ~40 ms | 80 KiB |
| STM32H7 + CMSIS-NN | CMSIS-NN DSP (CONFIG_AKIRA_TFLITE_CMSIS_NN=y) | ~25 ms | 80 KiB |
| nRF54L15 + CMSIS-NN | CMSIS-NN DSP | ~60 ms | 64 KiB |
| Generic Cortex-M4 | Reference kernels only | ~150 ms | 64 KiB |
Acceleration is auto-selected by Kconfig defaults based on SOC_SERIES / CPU_CORTEX_M symbols — no manual configuration needed on known boards.
Arena Sizing
The tensor arena must fit:
- All input/output tensors simultaneously
- All intermediate activation tensors
A safe rule of thumb: arena ≥ 2× the largest model layer activation size. For KWS/audio models (DS-CNN family), 80–128 KiB covers most cases. For image-classification models (MobileNetV1 96×96 INT8), plan for 256–512 KiB.
Set arena size in prj.conf:
CONFIG_AKIRA_AIINFER_ARENA_KB=128
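The rule of thumb above can be turned into a quick calculation. This is a rough sketch, not a guarantee: arena_kb_estimate is a hypothetical helper, and the real requirement depends on how TFLite Micro plans memory for the specific model, so treat the result as a starting value to verify on the target.

```c
#include <stddef.h>

/* Hypothetical helper implementing the rule of thumb from the text:
 * arena >= 2x the largest layer activation size.
 * Returns the estimate in KiB, rounded up, in the unit used by
 * CONFIG_AKIRA_AIINFER_ARENA_KB. */
unsigned int arena_kb_estimate(size_t largest_activation_bytes)
{
    size_t arena_bytes = 2u * largest_activation_bytes;
    return (unsigned int)((arena_bytes + 1023u) / 1024u);
}
```

For example, a model whose largest activation is 40 KiB yields an estimate of 80 KiB.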
Keyword Spotting Reference App
AkiraSDK/wasm_apps/console_apps/kws_demo/ is the canonical AkiraClaw demo.
It detects the “hey akira” wake word and then calls app_switch("shell").
Pipeline:
mic → 16 kHz ADC capture (1 s window)
→ log-mel spectrogram [49 × 40] INT8
→ DS-CNN model (INT8, ~20 KiB)
→ 2-class softmax [hey_akira, background]
→ confirm N consecutive frames above threshold
→ app_switch("shell")
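The "confirm N consecutive frames above threshold" step can be sketched as a small streak counter. This is an illustration under assumptions, not the kws_demo source: kws_confirm_t and kws_confirm_update are hypothetical names, and the threshold and N values are up to the app.

```c
#include <stdint.h>

/* Hypothetical state for the confirmation step. */
typedef struct {
    int8_t threshold;  /* INT8 score a frame must exceed to count as a hit */
    int    required;   /* N: consecutive hits needed (assumed >= 1) */
    int    streak;     /* current run of consecutive hits */
} kws_confirm_t;

/* Feed one frame's hey_akira score. Returns 1 when the streak reaches
 * `required`, then resets the streak so the next detection starts fresh.
 * Any frame at or below the threshold breaks the streak. */
int kws_confirm_update(kws_confirm_t *c, int8_t score)
{
    if (score > c->threshold) {
        c->streak++;
    } else {
        c->streak = 0;
    }
    if (c->streak >= c->required) {
        c->streak = 0;
        return 1;
    }
    return 0;
}
```

The app would call kws_confirm_update once per inference and invoke app_switch("shell") when it returns 1. Requiring several consecutive hits trades a little latency for fewer false triggers on single noisy frames.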
Build and deploy:
cd AkiraSDK/wasm_apps/console_apps/kws_demo
# Provide a DS-CNN KWS model quantized to INT8 (Google Speech Commands dataset)
make pack MODEL=/path/to/ds_cnn_kws_int8.tflite
make sign KEY=/path/to/privkey.pem
akira-cli install kws_demo.akpkg --device 192.168.1.42:8080 --token <token>
The model is not checked in to this repo. Train one with the TFLite Micro micro_speech example or convert any Google Speech Commands DS-CNN checkpoint using tflite_convert with --inference_type=INT8.
Threat Model & Security Notes
See docs/ai/security-model.md (TODO) for the full threat model. Key points:
- Model integrity: the model is covered by the Ed25519 package signature. A tampered model causes signature verification failure at install time.
- Input isolation: WASM memory bounds are enforced by WAMR. A malicious app cannot pass a pointer outside its sandbox to aiinfer_run.
- Capability gating: ai.infer must be declared in the manifest. Apps without this capability receive -EPERM from all aiinfer_* calls, and the denial is written to the audit log.
- Arena isolation: each slot has its own static arena; slots cannot access each other’s tensor data.
- Side-channel leakage: timing side-channels on inference output are the app developer’s responsibility. For medical/industrial use, consider adding random delay jitter around aiinfer_run calls.