A production-grade, on-device AI SDK for iOS, macOS, tvOS, and watchOS. The SDK enables low-latency, privacy-preserving inference for large language models, speech recognition, and voice synthesis with modular backend support.
The RunAnywhere Swift SDK enables developers to run AI models directly on Apple devices without requiring network connectivity for inference. By keeping data on-device, the SDK ensures minimal latency and maximum privacy for your users.
The SDK provides a unified interface to multiple AI capabilities, including large language models (LLMs), speech-to-text (STT), text-to-speech (TTS), voice activity detection (VAD), and speaker diarization. These capabilities are delivered through pluggable backend modules that can be included as needed.
| Platform | Minimum Version |
|---|---|
| iOS | 17.0+ |
| macOS | 14.0+ |
| tvOS | 17.0+ |
| watchOS | 10.0+ |
Swift Version: 5.9+
Xcode: 15.2+
Some optional modules have higher runtime requirements:
- Apple AI module (`RunAnywhereAppleAI`): iOS 26+ / macOS 26+ at runtime

Add the RunAnywhere SDK to your project using Xcode:
```
https://github.com/RunanywhereAI/runanywhere-sdks
```

Or declare the dependency in your `Package.swift`:

```swift
dependencies: [
    .package(url: "https://github.com/RunanywhereAI/runanywhere-sdks", from: "1.0.0")
],
targets: [
    .target(
        name: "YourApp",
        dependencies: [
            .product(name: "RunAnywhere", package: "runanywhere-sdks"),
            .product(name: "RunAnywhereLlamaCPP", package: "runanywhere-sdks"),
            .product(name: "RunAnywhereONNX", package: "runanywhere-sdks"),
        ]
    )
]
```
This repository contains two Package.swift files for different use cases:
| File | Location | Purpose |
|---|---|---|
| Root `Package.swift` | `runanywhere-sdks/Package.swift` | For external SPM consumers. Downloads pre-built XCFrameworks from GitHub releases. |
| Local `Package.swift` | `runanywhere-sdks/sdk/runanywhere-swift/Package.swift` | For SDK developers. Uses local XCFrameworks from the `Binaries/` directory. |
For app developers: use the root-level package via the GitHub URL (as shown above).
For SDK contributors: use the local package with `testLocal = true` after running the setup script.
```swift
import SwiftUI
import RunAnywhere
import LlamaCPPRuntime

@main
struct MyApp: App {
    init() {
        Task { @MainActor in
            // Register the LlamaCPP module for LLM support
            LlamaCPP.register()

            // Initialize the SDK
            do {
                try RunAnywhere.initialize(
                    apiKey: "<YOUR_API_KEY>",
                    baseURL: "https://api.runanywhere.ai",
                    environment: .production
                )
            } catch {
                print("SDK initialization failed: \(error)")
            }
        }
    }

    var body: some Scene {
        WindowGroup {
            ContentView()
        }
    }
}
```
```swift
// Simple chat interface
let response = try await RunAnywhere.chat("What is the capital of France?")
print(response) // "The capital of France is Paris."

// Full generation with metrics
let result = try await RunAnywhere.generate(
    "Explain quantum computing in simple terms",
    options: LLMGenerationOptions(
        maxTokens: 200,
        temperature: 0.7
    )
)
print("Response: \(result.text)")
print("Tokens used: \(result.tokensUsed)")
print("Speed: \(result.tokensPerSecond) tok/s")
```
```swift
// Load an LLM model by ID
try await RunAnywhere.loadModel("llama-3.2-1b-instruct-q4")

// Check if a model is loaded
let isLoaded = await RunAnywhere.isModelLoaded
```
```swift
try RunAnywhere.initialize(
    apiKey: "<YOUR_API_KEY>",
    baseURL: "https://api.runanywhere.ai",
    environment: .production
)
```
| Environment | Description |
|---|---|
| `.development` | Verbose logging, mock services, local analytics |
| `.staging` | Testing with real services |
| `.production` | Minimal logging, full authentication, telemetry |
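The environment is typically chosen per build configuration. A minimal sketch of that pattern, where the local `Environment` enum is a stand-in for the SDK's type rather than the SDK API itself:

```swift
// Stand-in for the SDK's environment enum, for illustration only.
enum Environment { case development, staging, production }

// Pick the environment from the build configuration: debug builds get
// verbose logging and mock services, release builds get production behavior.
func currentEnvironment() -> Environment {
    #if DEBUG
    return .development
    #else
    return .production
    #endif
}

print(currentEnvironment())
```

A `.staging` case can be selected explicitly (for example via a launch argument) when testing against real services.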
```swift
let options = LLMGenerationOptions(
    maxTokens: 100,
    temperature: 0.8,
    topP: 1.0,
    stopSequences: ["END"],
    streamingEnabled: false,
    preferredFramework: .llamaCpp,
    systemPrompt: "You are a helpful assistant."
)
```
Register modules at app startup before using their capabilities:
```swift
import RunAnywhere
import LlamaCPPRuntime
import ONNXRuntime

@MainActor
func setupSDK() {
    LlamaCPP.register() // LLM (priority: 100)
    ONNX.register()     // STT + TTS (priority: 100)
}
```
```swift
let result = try await RunAnywhere.generateStream(
    "Write a short poem about AI",
    options: LLMGenerationOptions(maxTokens: 150)
)

for try await token in result.stream {
    print(token, terminator: "")
}

let metrics = try await result.result.value
print("\nSpeed: \(metrics.tokensPerSecond) tok/s")
```
```swift
struct QuizQuestion: Generatable {
    let question: String
    let options: [String]
    let correctAnswer: Int

    static var jsonSchema: String {
        """
        {
          "type": "object",
          "properties": {
            "question": { "type": "string" },
            "options": { "type": "array", "items": { "type": "string" } },
            "correctAnswer": { "type": "integer" }
          },
          "required": ["question", "options", "correctAnswer"]
        }
        """
    }
}

let quiz: QuizQuestion = try await RunAnywhere.generateStructured(
    QuizQuestion.self,
    prompt: "Create a quiz question about Swift programming"
)
```
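Conceptually, structured generation amounts to the model emitting JSON that matches the schema and the SDK decoding it into your type. A self-contained sketch of that decoding step using plain `Codable` (no SDK required; the JSON string here stands in for model output):

```swift
import Foundation

// Same shape as the Generatable example, but plain Codable so this runs anywhere.
struct QuizQuestion: Codable {
    let question: String
    let options: [String]
    let correctAnswer: Int
}

// Stand-in for JSON the model would emit against the schema above.
let modelOutput = """
{"question": "Which keyword declares a constant in Swift?",
 "options": ["var", "let", "const", "def"],
 "correctAnswer": 1}
"""

// Decode the JSON into the typed struct, as generateStructured does for you.
let quiz = try! JSONDecoder().decode(QuizQuestion.self, from: Data(modelOutput.utf8))
print(quiz.options[quiz.correctAnswer]) // "let"
```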
```swift
import RunAnywhere
import ONNXRuntime

await ONNX.register()
try await RunAnywhere.loadSTTModel("whisper-base-onnx")

let audioData: Data = // your audio data (16kHz, mono, Float32)
let transcription = try await RunAnywhere.transcribe(audioData)
print("Transcribed: \(transcription)")
```
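The expected audio layout (16 kHz, mono, Float32) can be produced from raw samples like this. A self-contained sketch where `samples` is a synthetic sine tone standing in for real microphone capture (a real app would record via `AVAudioEngine` or similar):

```swift
import Foundation

// One second of a synthetic 440 Hz tone at 16 kHz, mono.
let sampleRate = 16_000
let samples: [Float] = (0..<sampleRate).map { i in
    Float(sin(2.0 * Double.pi * 440.0 * Double(i) / Double(sampleRate)))
}

// Pack the Float32 samples into the flat little-endian Data layout
// the transcriber expects: 4 bytes per sample.
let audioData = samples.withUnsafeBufferPointer { Data(buffer: $0) }
print(audioData.count) // 16_000 samples x 4 bytes = 64000
```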
```swift
try await RunAnywhere.loadTTSVoice("piper-en-us-amy")

let output = try await RunAnywhere.synthesize(
    "Hello! Welcome to RunAnywhere.",
    options: TTSOptions(
        speakingRate: 1.0,
        pitch: 1.0,
        volume: 0.8
    )
)
```
```swift
try await RunAnywhere.initializeVoiceAgent(
    sttModelId: "whisper-base-onnx",
    llmModelId: "llama-3.2-1b-instruct-q4",
    ttsVoice: "com.apple.ttsbundle.siri_female_en-US_compact"
)

let audioData: Data = // recorded audio
let result = try await RunAnywhere.processVoiceTurn(audioData)
print("User said: \(result.transcription)")
print("AI response: \(result.response)")

await RunAnywhere.cleanupVoiceAgent()
```
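Conceptually, a voice turn chains the three capabilities: STT, then LLM, then TTS. A toy sketch of that composition, with each stage stubbed as a closure (the real SDK resolves the registered backends instead):

```swift
import Foundation

// Each stage is a plain closure here; the SDK wires in the registered
// STT, LLM, and TTS services at these points.
struct VoicePipeline {
    let transcribe: (Data) -> String
    let generate: (String) -> String
    let synthesize: (String) -> Data

    func processTurn(_ audio: Data) -> (transcription: String, response: String, speech: Data) {
        let text = transcribe(audio)            // STT
        let reply = generate(text)              // LLM
        return (text, reply, synthesize(reply)) // TTS
    }
}

// Stubbed stages to show the data flow end to end.
let pipeline = VoicePipeline(
    transcribe: { _ in "hello" },
    generate: { "You said: \($0)" },
    synthesize: { Data($0.utf8) }
)

let turn = pipeline.processTurn(Data())
print(turn.response) // "You said: hello"
```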
```swift
import Combine
import RunAnywhere

class ViewModel: ObservableObject {
    private var cancellables = Set<AnyCancellable>()

    init() {
        // All SDK events
        RunAnywhere.events.events
            .receive(on: DispatchQueue.main)
            .sink { event in
                print("Event: \(event.type)")
            }
            .store(in: &cancellables)

        // LLM events only
        RunAnywhere.events.events(for: .llm)
            .sink { event in
                print("LLM Event: \(event.type)")
            }
            .store(in: &cancellables)
    }
}
```
```swift
let models = try await RunAnywhere.availableModels()
let model = models.first { $0.id == "llama-3.2-1b-instruct-q4" }!

let task = try await Download.shared.downloadModel(model)
for await progress in task.progress {
    let percent = Int(progress.overallProgress * 100)
    print("\(progress.stage.displayName): \(percent)%")
}
```
The RunAnywhere SDK follows a modular, provider-based architecture that separates core functionality from specific backend implementations:
```
+------------------------------------------------------------------+
|                            Public API                            |
|       RunAnywhere.generate() / transcribe() / synthesize()       |
+------------------------------------------------------------------+
                                 |
+------------------------------------------------------------------+
|                         Capability Layer                         |
|        LLMCapability | STTCapability | TTSCapability | ...       |
+------------------------------------------------------------------+
                                 |
+------------------------------------------------------------------+
|                          ServiceRegistry                         |
|         Routes requests to registered service providers          |
+------------------------------------------------------------------+
                                 |
          +----------------------+----------------------+
          v                      v                      v
+------------------+   +------------------+   +------------------+
| LlamaCPP Module  |   |   ONNX Module    |   |  AppleAI Module  |
|   (LLM: GGUF)    |   |   (STT + TTS)    |   |  (LLM: iOS 26+)  |
+------------------+   +------------------+   +------------------+
          |                      |                      |
          v                      v                      v
+------------------------------------------------------------------+
|                   Native Runtime / XCFramework                   |
|          RunAnywhereCore (C++ with Metal acceleration)           |
+------------------------------------------------------------------+
```
Key components:

- Public API: static entry points on `RunAnywhere` such as `generate()`, `transcribe()`, and `synthesize()`
- Capability layer: one capability per modality (LLM, STT, TTS, ...)
- `ServiceRegistry`: routes requests to registered service providers
- Backend modules: LlamaCPP (GGUF LLMs), ONNX (STT + TTS), AppleAI (LLM on iOS 26+)
- `RunAnywhereCore`: native C++ runtime with Metal acceleration, shipped as an XCFramework
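The routing step can be illustrated with a toy registry. The names and shapes below are illustrative, not the SDK's internal types; the point is that each module registers a provider with a priority, and the registry resolves the highest-priority provider per capability:

```swift
enum Capability: Hashable { case llm, stt, tts }

struct Provider {
    let name: String
    let priority: Int
}

final class ServiceRegistry {
    private var providers: [Capability: [Provider]] = [:]

    // Modules call this at startup, like LlamaCPP.register() does.
    func register(_ provider: Provider, for capability: Capability) {
        providers[capability, default: []].append(provider)
    }

    // The highest-priority provider wins, mirroring the "priority: 100"
    // annotations on the module registration calls.
    func resolve(_ capability: Capability) -> Provider? {
        providers[capability]?.max(by: { $0.priority < $1.priority })
    }
}

let registry = ServiceRegistry()
registry.register(Provider(name: "LlamaCPP", priority: 100), for: .llm)
registry.register(Provider(name: "AppleAI", priority: 50), for: .llm)
print(registry.resolve(.llm)!.name) // "LlamaCPP"
```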
```swift
RunAnywhere.setLogLevel(.debug)
RunAnywhere.configureLocalLogging(enabled: true)
RunAnywhere.setDebugMode(true)
await RunAnywhere.flushAll()
```
| Level | Description |
|---|---|
| `.debug` | Detailed information for debugging |
| `.info` | General operational information |
| `.warning` | Potential issues that don't prevent operation |
| `.error` | Errors that affect specific operations |
| `.fault` | Critical errors indicating serious problems |
The SDK automatically tracks key metrics such as generation latency, token throughput, and error rates.
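For example, the throughput figure surfaced as `tokensPerSecond` is, conceptually, just token count over elapsed generation time. A trivial sketch (the variable names are illustrative, not SDK fields):

```swift
import Foundation

// Throughput = tokens generated / wall-clock generation time.
let tokensUsed = 150
let generationTime: TimeInterval = 2.5 // seconds
let tokensPerSecond = Double(tokensUsed) / generationTime
print(tokensPerSecond) // 60.0
```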
All SDK errors are represented by `SDKError`, which includes cases such as:

```swift
case notInitialized
case invalidAPIKey(String?)
case invalidConfiguration(String)
case modelNotFound(String)
case modelLoadFailed(String, Error?)
case modelIncompatible(String, String)
case generationFailed(String)
case generationTimeout(String?)
case contextTooLong(Int, Int)
case networkUnavailable
case downloadFailed(String, Error?)
case insufficientStorage(Int64, Int64)
case storageFull
```
```swift
do {
    let result = try await RunAnywhere.generate("Hello")
} catch let error as SDKError {
    switch error.code {
    case .notInitialized:
        print("Please call RunAnywhere.initialize() first")
    case .modelNotFound:
        print("Model not found. Download it first.")
    case .generationFailed:
        print("Generation failed: \(error.message)")
    default:
        print("Error: \(error.localizedDescription)")
        if let suggestion = error.recoverySuggestion {
            print("Suggestion: \(suggestion)")
        }
    }
}
```
```swift
// Unload models when not in use
try await RunAnywhere.unloadModel()

// Check storage before downloading
let storageInfo = await RunAnywhere.getStorageInfo()
if storageInfo.availableBytes > (model.downloadSize ?? 0) {
    // Safe to download
}

// Clean up temporary files periodically
try await RunAnywhere.cleanTempFiles()
```
```swift
let result = try await RunAnywhere.generateStream(prompt)
for try await token in result.stream {
    await MainActor.run { self.text += token }
}
```
**Does inference require an internet connection?** No. Once models are downloaded, all inference happens on-device. You only need a connection for downloading models, syncing the model catalog, and optional analytics.

**Which model formats are supported?** The SDK supports GGUF models (via the LlamaCPP module) and ONNX models (via the ONNX module).

**How large are the models?** Model sizes vary significantly with parameter count and quantization level.

**Can I load multiple models at once?** Currently, only one LLM can be loaded at a time. STT and TTS models can be loaded alongside an LLM. Call `unloadModel()` before loading a different LLM.

**How do I get newly published models?** Call `fetchModelAssignments(forceRefresh: true)` to sync the latest model catalog. New versions can be downloaded alongside existing models.

**What data leaves the device?** By default, only anonymous analytics (latency, error rates) are collected. Actual prompts, responses, and audio data never leave the device.
**How do I debug issues?** Enable debug mode with `RunAnywhere.setDebugMode(true)` and subscribe to error events with `RunAnywhere.events.on(.error) { ... }`.

**What is the difference between `chat` and `generate`?** `chat(_:)` returns just the text string; `generate(_:options:)` returns an `LLMGenerationResult` with full metrics.

We welcome contributions to the RunAnywhere Swift SDK. This section explains how to set up your development environment to build the SDK from source and test your changes with the sample app.
The SDK depends on native C++ libraries from runanywhere-commons. The setup script builds these locally so you can develop and test the SDK end-to-end.
```bash
# 1. Clone the repository
git clone https://github.com/RunanywhereAI/runanywhere-sdks.git
cd runanywhere-sdks/sdk/runanywhere-swift

# 2. Run first-time setup (~5-15 minutes)
./scripts/build-swift.sh --setup
```
What the setup script does:
- Builds `RACommons.xcframework` (core infrastructure)
- Builds `RABackendLLAMACPP.xcframework` (LLM backend)
- Builds `RABackendONNX.xcframework` (STT/TTS/VAD backend)
- Copies the frameworks into `Binaries/`
- Sets `testLocal = true` in `Package.swift` (enables local framework consumption)

The SDK has two modes controlled by `testLocal` in `Package.swift`:
| Mode | Setting | Description |
|---|---|---|
| Local | `testLocal = true` | Uses XCFrameworks from `Binaries/` (for development) |
| Remote | `testLocal = false` | Downloads XCFrameworks from GitHub releases (for end users) |
When you run `--setup`, the script automatically sets `testLocal = true`.
The recommended way to test SDK changes is with the sample app:
```bash
# 1. Ensure the SDK is set up (from the previous step)

# 2. Navigate to the sample app
cd ../../examples/ios/RunAnywhereAI

# 3. Open in Xcode
open RunAnywhereAI.xcodeproj

# 4. If Xcode shows package errors, reset caches:
#    File > Packages > Reset Package Caches

# 5. Build and Run (⌘+R)
```
The sample app's `Package.swift` references the local SDK, which in turn uses the local frameworks from `Binaries/`. This creates a complete local development loop:

```
Sample App → Local Swift SDK → Local XCFrameworks (Binaries/)
                                         ↑
                           Built by build-swift.sh --setup
```
After modifying Swift SDK code: rebuild the sample app in Xcode (⌘+R); it consumes the SDK as a local package, so no extra steps are needed.
After modifying runanywhere-commons (C++ code):

```bash
cd sdk/runanywhere-swift
./scripts/build-swift.sh --local --build-commons
```
| Command | Description |
|---|---|
| `--setup` | First-time setup: downloads deps, builds all frameworks, sets `testLocal = true` |
| `--local` | Use local frameworks from `Binaries/` |
| `--remote` | Use remote frameworks from GitHub releases |
| `--build-commons` | Rebuild runanywhere-commons from source |
| `--clean` | Clean build artifacts before building |
| `--release` | Build in release mode (default: debug) |
| `--skip-build` | Only set up frameworks; skip `swift build` |
| `--set-local` | Set `testLocal = true` in `Package.swift` |
| `--set-remote` | Set `testLocal = false` in `Package.swift` |
Run the unit tests:

```bash
swift test
```
The project uses SwiftLint for code style enforcement:

```bash
brew install swiftlint
swiftlint
```
To contribute:

1. Create a feature branch: `git checkout -b feature/my-feature`
2. Make your changes and run `swift test`
3. Run `swiftlint` and fix any warnings

Open an issue on GitHub with:
- The SDK version (`RunAnywhere.version`)

Copyright 2025 RunAnywhere AI. All rights reserved.
See the repository for license terms. For commercial licensing inquiries, contact san@runanywhere.ai.