On-Device AI for Flutter Applications
Run LLMs, Speech-to-Text, Text-to-Speech, and Voice AI pipelines locally—privacy-first, offline-capable, production-ready.
| Component | Minimum | Recommended |
|---|---|---|
| Flutter | 3.10.0+ | 3.24.0+ |
| Dart | 3.0.0+ | 3.5.0+ |
| iOS | 15.1+ | 16.0+ |
| Android | API 24 (Android 7.0) | API 28+ |
| Xcode | 15.0+ | 15.0+ |
| RAM | 2GB | 4GB+ for larger models |
| Storage | Variable | Models: 100MB–8GB |
Note: ARM64 devices are recommended for best performance. Metal GPU acceleration on iOS and NEON SIMD on Android provide significant speedups over unoptimized CPU inference.
Add the packages you need to your pubspec.yaml:
Core + LlamaCpp (LLM):
dependencies:
runanywhere: ^0.19.13
runanywhere_llamacpp: ^0.19.13
Core + ONNX (STT/TTS/VAD):
dependencies:
runanywhere: ^0.19.13
runanywhere_onnx: ^0.19.13
All Backends (LLM + STT + TTS + VAD):
dependencies:
runanywhere: ^0.19.13
runanywhere_llamacpp: ^0.19.13
runanywhere_onnx: ^0.19.13
Then run:
flutter pub get
After adding the packages, update your iOS Podfile:
1. Update ios/Podfile:
# Set minimum iOS version to 15.1
platform :ios, '15.1'
target 'Runner' do
# REQUIRED: Add static linkage
use_frameworks! :linkage => :static
flutter_install_all_ios_pods File.dirname(File.realpath(__FILE__))
end
post_install do |installer|
installer.pods_project.targets.each do |target|
flutter_additional_ios_build_settings(target)
target.build_configurations.each do |config|
config.build_settings['IPHONEOS_DEPLOYMENT_TARGET'] = '15.1'
# Required for microphone permission (STT/Voice features)
config.build_settings['GCC_PREPROCESSOR_DEFINITIONS'] ||= [
'$(inherited)',
'PERMISSION_MICROPHONE=1',
]
end
end
end
Important: Without use_frameworks! :linkage => :static, you will see “symbol not found” errors at runtime.
2. Update ios/Runner/Info.plist:
Add microphone permission for STT/Voice features:
<key>NSMicrophoneUsageDescription</key>
<string>This app needs microphone access for speech recognition</string>
3. Run pod install:
cd ios && pod install && cd ..
Add microphone permission to android/app/src/main/AndroidManifest.xml:
<uses-permission android:name="android.permission.RECORD_AUDIO" />
import 'package:flutter/material.dart'; // WidgetsFlutterBinding, runApp
import 'package:runanywhere/runanywhere.dart';
import 'package:runanywhere_llamacpp/runanywhere_llamacpp.dart';
import 'package:runanywhere_onnx/runanywhere_onnx.dart';
void main() async {
WidgetsFlutterBinding.ensureInitialized();
// 1. Initialize SDK (development mode - no API key needed)
await RunAnywhere.initialize();
// 2. Register backend modules
await LlamaCpp.register(); // LLM backend (GGUF models)
await Onnx.register(); // STT/TTS backend (Whisper, Piper)
print('RunAnywhere SDK initialized: v${RunAnywhere.version}');
runApp(const MyApp());
}
// Register an LLM model
RunAnywhere.models.register(
id: 'smollm2-360m-q8_0',
name: 'SmolLM2 360M Q8_0',
url: Uri.parse('https://huggingface.co/prithivMLmods/SmolLM2-360M-GGUF/resolve/main/SmolLM2-360M.Q8_0.gguf'),
framework: InferenceFramework.INFERENCE_FRAMEWORK_LLAMA_CPP,
memoryRequirement: 500000000,
);
// Register an STT model
RunAnywhere.models.register(
id: 'sherpa-onnx-whisper-tiny.en',
name: 'Whisper Tiny English',
url: Uri.parse('https://github.com/RunanywhereAI/sherpa-onnx/releases/download/runanywhere-models-v1/sherpa-onnx-whisper-tiny.en.tar.gz'),
framework: InferenceFramework.INFERENCE_FRAMEWORK_SHERPA,
modality: ModelCategory.MODEL_CATEGORY_SPEECH_RECOGNITION,
);
// Register a TTS voice
RunAnywhere.models.register(
id: 'vits-piper-en_US-lessac-medium',
name: 'Piper US English',
url: Uri.parse('https://github.com/RunanywhereAI/sherpa-onnx/releases/download/runanywhere-models-v1/vits-piper-en_US-lessac-medium.tar.gz'),
framework: InferenceFramework.INFERENCE_FRAMEWORK_SHERPA,
modality: ModelCategory.MODEL_CATEGORY_SPEECH_SYNTHESIS,
);
// Download with progress
final progressStream = RunAnywhere.downloads.start('smollm2-360m-q8_0');
await for (final p in progressStream) {
print('Stage: ${p.stage}, progress: ${(p.stageProgress * 100).toStringAsFixed(1)}%');
if (p.stage == DownloadStage.DOWNLOAD_STAGE_COMPLETED) break;
}
// Load the model
await RunAnywhere.llm.load('smollm2-360m-q8_0');
print('Model loaded: ${RunAnywhere.isLLMModelLoaded}');
// Non-streaming with full metrics
final result = await RunAnywhere.llm.generate(
'Explain quantum computing in simple terms',
LLMGenerationOptions(
maxTokens: 200,
temperature: 0.7,
),
);
print('Response: ${result.text}');
print('Speed: ${result.tokensPerSecond.toStringAsFixed(1)} tok/s');
// Streaming generation (stdout below comes from dart:io, so add: import 'dart:io';)
final stream = RunAnywhere.llm.generateStream(
'Write a short poem about AI',
LLMGenerationOptions(maxTokens: 150),
);
// Display tokens in real-time
await for (final event in stream) {
if (event.isFinal) break;
if (event.token.isNotEmpty) {
stdout.write(event.token);
}
}
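In a Flutter widget you usually want the text so far rather than individual tokens written to stdout. The following is a minimal sketch of that idea, assuming the tokens have already been mapped to a plain Stream<String>; the helper name cumulativeText is hypothetical and not part of the SDK.

```dart
import 'dart:async';

/// Turns a stream of tokens into a stream of the accumulated text so far.
/// Empty tokens are skipped so listeners are not notified for no-op updates.
Stream<String> cumulativeText(Stream<String> tokens) async* {
  final buffer = StringBuffer();
  await for (final token in tokens) {
    if (token.isEmpty) continue;
    buffer.write(token);
    yield buffer.toString();
  }
}
```

Each emitted value can be passed straight to setState or a StreamBuilder, so the UI always shows the full partial response.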
// Load STT model
await RunAnywhere.stt.load('sherpa-onnx-whisper-tiny.en');
// Transcribe audio data (PCM16 at 16kHz mono)
final result = await RunAnywhere.stt.transcribe(audioBytes);
print('Text: ${result.text}');
print('Confidence: ${result.confidence}');
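Audio recorders often deliver floating-point samples, while the transcribe call above expects PCM16 at 16 kHz mono. The sketch below shows one way to convert, assuming samples are in the range -1.0 to 1.0 and that little-endian byte order is wanted; both helper names are hypothetical, and resampling to 16 kHz is not covered.

```dart
import 'dart:typed_data';

/// Converts float samples in [-1.0, 1.0] to little-endian PCM16 bytes.
/// Out-of-range samples are clamped rather than wrapped.
Uint8List floatToPcm16(List<double> samples) {
  final bytes = ByteData(samples.length * 2);
  for (var i = 0; i < samples.length; i++) {
    final clamped = samples[i].clamp(-1.0, 1.0);
    bytes.setInt16(i * 2, (clamped * 32767).round(), Endian.little);
  }
  return bytes.buffer.asUint8List();
}

/// Duration of a mono PCM16 buffer: two bytes per sample.
double pcm16DurationSeconds(Uint8List pcm, {int sampleRate = 16000}) =>
    pcm.length / 2 / sampleRate;
```

Checking the duration before transcribing is a cheap way to reject empty or truncated recordings.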
// Load TTS voice
await RunAnywhere.tts.loadVoice('vits-piper-en_US-lessac-medium');
// Synthesize speech
final ttsResult = await RunAnywhere.tts.synthesize(
'Hello! Welcome to RunAnywhere.',
TTSOptions(rate: 1.0, pitch: 1.0),
);
// ttsResult.audio is Uint8List PCM16
// ttsResult.sampleRate is typically 22050 Hz
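Raw PCM16 has no header, so most audio players and file viewers cannot open it directly. A minimal sketch of wrapping the bytes in a WAV container follows. It assumes the audio is mono and little-endian, which the text above does not confirm, and pcm16ToWav is a hypothetical helper rather than an SDK function.

```dart
import 'dart:typed_data';

/// Prepends a minimal 44-byte RIFF/WAVE header to raw PCM16 samples.
Uint8List pcm16ToWav(Uint8List pcm, {int sampleRate = 22050, int channels = 1}) {
  const bitsPerSample = 16;
  final blockAlign = channels * bitsPerSample ~/ 8; // bytes per sample frame
  final byteRate = sampleRate * blockAlign;
  final header = ByteData(44);

  // Writes a four-character ASCII chunk tag at the given offset.
  void writeTag(int offset, String tag) {
    for (var i = 0; i < tag.length; i++) {
      header.setUint8(offset + i, tag.codeUnitAt(i));
    }
  }

  writeTag(0, 'RIFF');
  header.setUint32(4, 36 + pcm.length, Endian.little); // file size minus 8
  writeTag(8, 'WAVE');
  writeTag(12, 'fmt ');
  header.setUint32(16, 16, Endian.little); // fmt chunk size
  header.setUint16(20, 1, Endian.little); // format 1 = uncompressed PCM
  header.setUint16(22, channels, Endian.little);
  header.setUint32(24, sampleRate, Endian.little);
  header.setUint32(28, byteRate, Endian.little);
  header.setUint16(32, blockAlign, Endian.little);
  header.setUint16(34, bitsPerSample, Endian.little);
  writeTag(36, 'data');
  header.setUint32(40, pcm.length, Endian.little);

  final out = Uint8List(44 + pcm.length);
  out.setRange(0, 44, header.buffer.asUint8List());
  out.setRange(44, out.length, pcm);
  return out;
}
```

Pass the sample rate reported by the synthesis result instead of relying on the default, since a wrong rate changes playback speed and pitch.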
// Ensure all components are loaded
await RunAnywhere.stt.load('sherpa-onnx-whisper-tiny.en');
await RunAnywhere.llm.load('smollm2-360m-q8_0');
await RunAnywhere.tts.loadVoice('vits-piper-en_US-lessac-medium');
// Initialize voice pipeline with the loaded models
await RunAnywhere.voice.initializeWithLoadedModels();
// Subscribe to the voice event stream
final sub = RunAnywhere.voice.eventStream().listen((event) {
if (event.hasUserSaid()) {
print('User: ${event.userSaid.text}');
} else if (event.hasAssistantToken()) {
stdout.write(event.assistantToken.text);
}
});
// Cancel when done
await sub.cancel();
The RunAnywhere Flutter SDK follows a modular, provider-based architecture with a C++ commons layer for cross-platform performance:
┌─────────────────────────────────────────────────────────────────┐
│ Your Flutter Application │
├─────────────────────────────────────────────────────────────────┤
│ RunAnywhere Flutter SDK │
│ ┌──────────────┐ ┌───────────────┐ ┌──────────────────────┐ │
│ │ Public APIs │ │ EventBus │ │ ModelRegistry │ │
│ │ (generate, │ │ (events, │ │ (model discovery, │ │
│ │ transcribe) │ │ lifecycle) │ │ download) │ │
│ └──────────────┘ └───────────────┘ └──────────────────────┘ │
├─────────────────────────────────────────────────────────────────┤
│ Native Bridge Layer (FFI) │
│ DartBridge → C++ Commons APIs │
├────────────┬─────────────┬──────────────────────────────────────┤
│ LlamaCpp │ ONNX │ Future Backends... │
│ Backend │ Backend │ │
│ (LLM) │ (STT/TTS) │ │
└────────────┴─────────────┴──────────────────────────────────────┘
| Component | Description |
|---|---|
| RunAnywhere | Singleton entry point providing 20 capability accessors (llm, stt, tts, vad, vlm, voice, models, downloads, tools, rag, …) |
| EventBus | Pure dart:async broadcast stream for SDK events (no rxdart dependency) |
| DartBridge | FFI bridge slices to the C++ commons library (33 dart_bridge_*.dart files) |
| ModelRegistry | Model discovery, registration, and persistence via the C++ registry |
| Package | Size | Provides |
|---|---|---|
| runanywhere | ~5MB | Core SDK, capability surface, registries, events |
| runanywhere_llamacpp | ~15-25MB | LLM + VLM (GGUF models) |
| runanywhere_onnx | ~50-70MB | STT, TTS, VAD (Sherpa/ONNX models) |
| runanywhere_genie | varies | Qualcomm Genie NPU LLM (Android/Snapdragon only) |
// Development mode (default) - no API key needed
await RunAnywhere.initialize();
// Production mode - requires API key and backend URL
await RunAnywhere.initialize(
apiKey: '<YOUR_API_KEY>',
baseURL: 'https://api.runanywhere.ai',
environment: SDKEnvironment.SDK_ENVIRONMENT_PRODUCTION,
);
| Environment | Description |
|---|---|
| SDKEnvironment.SDK_ENVIRONMENT_DEVELOPMENT | Verbose logging, local-only, no auth required |
| SDKEnvironment.SDK_ENVIRONMENT_STAGING | Testing with real services |
| SDKEnvironment.SDK_ENVIRONMENT_PRODUCTION | Minimal logging, full authentication, telemetry |
final options = LLMGenerationOptions(
maxTokens: 256, // Maximum tokens to generate
temperature: 0.7, // Sampling temperature (0.0–2.0)
topP: 0.95, // Top-p sampling parameter
stopSequences: ['END'], // Stop generation at these sequences
systemPrompt: 'You are a helpful assistant.',
);
The SDK reports errors through a single exception type, SDKException:
try {
final result = await RunAnywhere.llm.generate(
'Hello!',
LLMGenerationOptions(maxTokens: 64),
);
} on SDKException catch (error) {
// SDKException is a proto-backed unified error type with 40+ factory constructors.
// Inspect error.message and error.errorCode (proto enum) to branch.
print('SDK error [${error.errorCode}]: ${error.message}');
}
| Category | Description |
|---|---|
| general | General SDK errors |
| llm | LLM generation errors |
| stt | Speech-to-text errors |
| tts | Text-to-speech errors |
| voice | Voice pipeline errors |
| download | Model download errors |
| validation | Input validation errors |
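Some of these categories, such as download errors, are often transient, while validation errors will fail the same way on every attempt. The sketch below shows a generic retry wrapper with exponential backoff. It is not part of the SDK: withRetry, backoffDelay, and the default delays are all hypothetical, and deciding which error codes are worth retrying is left to the shouldRetry callback.

```dart
import 'dart:async';
import 'dart:math';

/// Delay before retry number [attempt] (0-based): base * 2^attempt, capped.
Duration backoffDelay(
  int attempt, {
  Duration base = const Duration(milliseconds: 500),
  Duration cap = const Duration(seconds: 8),
}) {
  final ms = base.inMilliseconds * pow(2, attempt).toInt();
  return Duration(milliseconds: min(ms, cap.inMilliseconds));
}

/// Runs [action], retrying while [shouldRetry] accepts the exception and
/// fewer than [maxAttempts] attempts have been made. The last failure is
/// rethrown unchanged.
Future<T> withRetry<T>(
  Future<T> Function() action, {
  int maxAttempts = 3,
  bool Function(Exception error)? shouldRetry,
}) async {
  for (var attempt = 0; ; attempt++) {
    try {
      return await action();
    } on Exception catch (error) {
      final retryable = shouldRetry?.call(error) ?? true;
      if (!retryable || attempt + 1 >= maxAttempts) rethrow;
      await Future<void>.delayed(backoffDelay(attempt));
    }
  }
}
```

A call site would wrap the failing operation in a closure and pass a shouldRetry that inspects the caught exception, for example returning false for validation failures.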
// Subscribe to all SDK events
RunAnywhere.events.allEvents.listen((event) {
print('Event: $event');
});
| Event | Description |
|---|---|
| SDKInitializationStarted | SDK initialization began |
| SDKInitializationCompleted | SDK initialized successfully |
| SDKModelEvent.loadStarted | Model loading started |
| SDKModelEvent.loadCompleted | Model loaded successfully |
| SDKModelEvent.downloadProgress | Download progress update |
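Listening to every event is rarely what a screen needs; usually it cares about one event type. The sketch below illustrates how a dart:async broadcast stream can be filtered by type. The event classes and SimpleEventBus are hypothetical stand-ins written for this example, not the SDK's own types.

```dart
import 'dart:async';

// Hypothetical event types standing in for the SDK's event classes.
abstract class AppEvent {}

class ModelLoadStarted extends AppEvent {
  ModelLoadStarted(this.modelId);
  final String modelId;
}

class DownloadProgress extends AppEvent {
  DownloadProgress(this.modelId, this.fraction);
  final String modelId;
  final double fraction;
}

/// Minimal broadcast bus built only on dart:async.
class SimpleEventBus {
  final _controller = StreamController<AppEvent>.broadcast();

  /// Every published event, in publication order.
  Stream<AppEvent> get allEvents => _controller.stream;

  /// Only events of type [T].
  Stream<T> on<T extends AppEvent>() =>
      _controller.stream.where((e) => e is T).cast<T>();

  void publish(AppEvent event) => _controller.add(event);

  Future<void> close() => _controller.close();
}

/// Publishes two events and returns what a progress-only listener saw.
Future<List<String>> demo() async {
  final bus = SimpleEventBus();
  final seen = <String>[];
  final sub = bus.on<DownloadProgress>().listen(
        (e) => seen.add('${e.modelId}: ${(e.fraction * 100).round()}%'),
      );

  bus.publish(ModelLoadStarted('demo-model'));
  bus.publish(DownloadProgress('demo-model', 0.5));

  // Broadcast delivery is asynchronous, so yield to the event loop once.
  await Future<void>.delayed(Duration.zero);
  await sub.cancel();
  await bus.close();
  return seen;
}
```

Because broadcast streams do not replay past events, a listener only sees what is published after it subscribes, so subscriptions are best set up before the work that emits events begins.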
| Model Size | RAM Required | Use Case |
|---|---|---|
| 360M–500M (Q8) | ~500MB | Fast, lightweight chat |
| 1B–3B (Q4/Q6) | 1–2GB | Balanced quality/speed |
| 7B (Q4) | 4–5GB | High quality, slower |
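The table can be turned into a simple selection rule at startup. The following sketch picks the largest tier that fits a memory budget. The tier values use the upper bound of each RAM range in the table, and the 50% usable fraction is an assumption chosen for illustration, since the operating system and the app itself also need memory. All names here are hypothetical.

```dart
/// One row of the sizing table, with RAM taken as the upper bound of the range.
class ModelTier {
  const ModelTier(this.label, this.ramBytes);
  final String label;
  final int ramBytes;
}

const bytesPerGb = 1024 * 1024 * 1024;

/// Tiers ordered from smallest to largest memory requirement.
const tiers = [
  ModelTier('360M–500M (Q8)', 500 * 1024 * 1024),
  ModelTier('1B–3B (Q4/Q6)', 2 * bytesPerGb),
  ModelTier('7B (Q4)', 5 * bytesPerGb),
];

/// Largest tier whose RAM need fits within [deviceRamBytes] * [usableFraction].
/// Returns null when even the smallest tier does not fit.
ModelTier? largestTierFor(int deviceRamBytes, {double usableFraction = 0.5}) {
  final budget = deviceRamBytes * usableFraction;
  ModelTier? best;
  for (final tier in tiers) {
    if (tier.ramBytes <= budget) best = tier;
  }
  return best;
}
```

How the device's total RAM is obtained is outside this sketch; on a real device it would come from a platform API or plugin.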
// Unload models when not in use
await RunAnywhere.llm.unload();
await RunAnywhere.stt.unload();
await RunAnywhere.tts.unloadVoice();
// Check storage before downloading
final storageInfo = await RunAnywhere.downloads.getStorageInfo();
print('Available: ${storageInfo.freeBytes} bytes');
// Delete unused models
await RunAnywhere.downloads.delete('old-model-id');
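A free-space figure in raw bytes is only useful once it is compared with the model size. The sketch below shows a pre-download check and a display formatter. Both helpers are hypothetical, and the 20% headroom is an assumption meant to cover archive extraction and temporary files.

```dart
/// True when [freeBytes] covers [modelBytes] plus a safety margin.
bool hasRoomFor(int modelBytes, int freeBytes, {double headroom = 0.2}) =>
    freeBytes >= (modelBytes * (1 + headroom)).ceil();

/// Formats a byte count using binary units (1 KB = 1024 B).
String formatBytes(int bytes) {
  const units = ['B', 'KB', 'MB', 'GB', 'TB'];
  var value = bytes.toDouble();
  var unit = 0;
  while (value >= 1024 && unit < units.length - 1) {
    value /= 1024;
    unit++;
  }
  // Whole bytes need no decimals; larger units show one decimal place.
  return '${value.toStringAsFixed(unit == 0 ? 0 : 1)} ${units[unit]}';
}
```

With the storage query above, the check would read the free-byte count from the returned info and compare it against the size of the model about to be downloaded.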
Symptoms: Download stuck or fails with network error
Solutions:
Symptoms: App crashes during model loading or inference
Solutions:
Symptoms: Runtime crash with “symbol not found” error
Solutions:
- Confirm use_frameworks! :linkage => :static is set in the Podfile
- Run cd ios && pod install --repo-update
- Run flutter clean && flutter run
Symptoms: UnsatisfiedLinkError or library load failure
Solutions:
- Check that the jniLibs folder contains .so files
- Run ./scripts/build/build-core-android.sh <ABI> from the repo root
Symptoms: modelNotFound error even though download completed
Solutions:
- Call await RunAnywhere.refreshModelRegistry() to refresh the registry
A: Only for initial model download. Once downloaded, all inference runs 100% on-device with no network required.
A: Varies by model:
A: No. All inference happens on-device. Only anonymous analytics (latency, error rates) are collected in production mode, and this can be disabled.
A: iOS 15.1+ and Android API 24+. ARM64 devices are recommended for best performance.
A: Yes. GGUF models work with the LlamaCpp backend, and ONNX models in the appropriate format work for STT/TTS.
A: The SDK supports both arm64 and x86_64 simulators, but performance will be significantly slower than physical devices.
Contributions are welcome. This section explains how to set up your development environment to build the SDK from source and test your changes with the sample app.
The SDK depends on native C++ libraries from runanywhere-commons. The setup script builds these locally so you can develop and test the SDK end-to-end.
# 1. Clone the repository
git clone https://github.com/RunanywhereAI/runanywhere-sdks.git
cd runanywhere-sdks
# 2. Build the native artifacts (from repo root)
./sdk/runanywhere-swift/scripts/build-core-xcframework.sh # iOS XCFrameworks
./scripts/build/build-core-android.sh arm64-v8a # Android .so files
# 3. Bootstrap the Flutter workspace
cd sdk/runanywhere-flutter
melos bootstrap
What the build scripts do:
- sdk/runanywhere-swift/scripts/build-core-xcframework.sh builds the RACommons, RABackendLLAMACPP, RABackendONNX, and RABackendSherpa XCFrameworks and stages them into sdk/runanywhere-flutter/packages/*/ios/Frameworks/.
- scripts/build/build-core-android.sh <ABI> builds librac_commons.so plus per-backend .so libraries and stages them into sdk/runanywhere-flutter/packages/*/android/src/main/jniLibs/<ABI>/.
| Mode | Description |
|---|---|
| Local (default) | runanywhere.useLocalNatives=true in gradle.properties; iOS podspecs vendor local Frameworks/ |
| Remote | CI override -Prunanywhere.useLocalNatives=false; downloads from GitHub Releases |
The recommended way to test SDK changes is with the sample app:
# 1. Ensure SDK is set up (from previous step)
# 2. Navigate to the sample app
cd ../../examples/flutter/RunAnywhereAI
# 3. Install dependencies
flutter pub get
# 4. Run on iOS
cd ios && pod install && cd ..
flutter run
# 5. Or run on Android
flutter run
You can open the sample app in Android Studio or VS Code for development.
The sample app’s pubspec.yaml uses path dependencies to reference the local SDK packages:
Sample App → Local Flutter SDK Packages → Local Frameworks/JNI libs
↑
Built by scripts/build/build-core-*.sh
After modifying Dart SDK code:
Re-run flutter run.
After modifying runanywhere-commons (C++ code):
# From repo root
./sdk/runanywhere-swift/scripts/build-core-xcframework.sh
./scripts/build/build-core-android.sh arm64-v8a
| Script | Description |
|---|---|
| sdk/runanywhere-swift/scripts/build-core-xcframework.sh | iOS: builds RACommons/RABackendLLAMACPP/RABackendONNX/RABackendSherpa XCFrameworks and stages into Flutter packages’ ios/Frameworks/. |
| scripts/build/build-core-android.sh <ABI> | Android: builds backend .so files and stages into Flutter packages’ android/src/main/jniLibs/<ABI>/. |
| sdk/runanywhere-web/scripts/build-core-wasm.sh | (Not used by Flutter; targets the Web SDK.) |
| sdk/runanywhere-flutter/scripts/package-sdk.sh | Validate all 4 Flutter packages via pub publish --dry-run. |
We follow standard Dart style guidelines:
# Format code
dart format lib/ test/
# Analyze code
flutter analyze
# Fix issues automatically
dart fix --apply
- Create a feature branch: git checkout -b feature/my-feature
- Run the tests: flutter test
- Run the analyzer: flutter analyze
Open an issue on GitHub with:
- SDK version: RunAnywhere.version
- Flutter version: flutter --version
Apache License 2.0. See LICENSE for details.
For commercial licensing inquiries, contact san@runanywhere.ai.