runanywhere-sdks

Local Browser - On-Device AI Web Automation

Support for runanywhere-web-sdk is launching soon in our main repo. Check it out: https://github.com/RunanywhereAI/runanywhere-sdks

A Chrome extension that uses WebLLM to run AI-powered web automation entirely on-device. No cloud APIs, no API keys, fully private.

Demo

https://github.com/user-attachments/assets/898cc5c2-db77-4067-96e6-233c5da2bae5

Features

Quick Start

Prerequisites

Installation

  1. Clone and install dependencies:
    cd local-browser
    npm install
    
  2. Build the extension:
    npm run build
    
  3. Load in Chrome:
    • Open chrome://extensions
    • Enable “Developer mode” (top right)
    • Click “Load unpacked”
    • Select the dist folder from this project
  4. First run:
    • Click the extension icon in your toolbar
    • The first run will download the AI model (~1GB)
    • The downloaded model is cached for future use

Usage

  1. Navigate to any webpage
  2. Click the Local Browser extension icon
  3. Type a task like:
    • “Search for ‘WebGPU’ on Wikipedia and extract the first paragraph”
    • “Go to example.com and tell me what’s there”
    • “Find the search box and search for ‘AI news’”
  4. Watch the AI execute the task step by step

Development

Development Mode

npm run dev

This watches for changes and rebuilds automatically.

Project Structure

local-browser/
├── manifest.json           # Chrome extension manifest (MV3)
├── src/
│   ├── background/         # Service worker
│   │   ├── index.ts        # Entry point & message handling
│   │   ├── llm-engine.ts   # WebLLM wrapper
│   │   └── agents/         # AI agent system
│   │       ├── base-agent.ts
│   │       ├── planner-agent.ts
│   │       ├── navigator-agent.ts
│   │       └── executor.ts
│   ├── content/            # Content scripts
│   │   ├── dom-observer.ts # Page state extraction
│   │   └── action-executor.ts
│   ├── popup/              # React popup UI
│   │   ├── App.tsx
│   │   └── components/
│   └── shared/             # Shared types & constants
└── dist/                   # Build output

How It Works

  1. User enters a task in the popup UI
  2. Planner Agent analyzes the task and creates a high-level strategy
  3. Navigator Agent examines the current page DOM and decides on the next action
  4. Content Script executes the action (click, type, extract, etc.)
  5. The loop continues until the task completes or fails
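
The steps above can be sketched as a single control loop. This is a minimal illustration under stated assumptions, not the extension's actual code: the Agents and Page interfaces, the Action shape, and the maxSteps cap are hypothetical names introduced here. In the real extension the page calls would presumably cross the service worker / content script boundary through extension messaging.

```typescript
// Hypothetical types: they only mirror the five steps listed above.
type Action =
  | { type: "click"; selector: string }
  | { type: "type"; selector: string; text: string }
  | { type: "extract"; selector: string }
  | { type: "done"; result: string }
  | { type: "fail"; reason: string };

interface PageState {
  url: string;
  summary: string; // condensed view of the DOM handed to the model
}

interface Agents {
  // Planner Agent: turns the task into a high-level strategy.
  plan(task: string): Promise<string>;
  // Navigator Agent: picks the next action from the current page state.
  nextAction(plan: string, page: PageState, history: Action[]): Promise<Action>;
}

interface Page {
  observe(): Promise<PageState>; // content script: extract page state
  execute(action: Action): Promise<void>; // content script: click, type, extract
}

type Outcome =
  | { ok: true; result: string; steps: number }
  | { ok: false; reason: string; steps: number };

async function runTask(
  task: string,
  agents: Agents,
  page: Page,
  maxSteps = 10 // assumed safety cap so a confused model cannot loop forever
): Promise<Outcome> {
  const plan = await agents.plan(task);
  const history: Action[] = [];

  for (let step = 0; step < maxSteps; step++) {
    const state = await page.observe();
    const action = await agents.nextAction(plan, state, history);

    // Terminal actions end the loop without touching the page.
    if (action.type === "done") return { ok: true, result: action.result, steps: step };
    if (action.type === "fail") return { ok: false, reason: action.reason, steps: step };

    await page.execute(action);
    history.push(action);
  }
  return { ok: false, reason: `step limit (${maxSteps}) reached`, steps: maxSteps };
}
```

Passing the agents and the page in as interfaces keeps the loop testable with stubs, without loading a model or opening a browser tab.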

Agent System

The extension uses a two-agent architecture inspired by Nanobrowser:

Both agents output structured JSON that is parsed and executed.
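
Since the agents reply with JSON, the raw model text has to be parsed and validated before anything is executed. The following is a minimal sketch of that step. The AgentAction schema and the function names are hypothetical, and the real schema in the extension's shared types may differ.

```typescript
// Hypothetical action schema, for illustration only.
type AgentAction =
  | { action: "click"; selector: string }
  | { action: "type"; selector: string; text: string }
  | { action: "extract"; selector: string }
  | { action: "done"; result: string };

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Models may wrap the JSON in extra prose, so take the outermost {...} span.
function extractJson(raw: string): string | null {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  return start !== -1 && end > start ? raw.slice(start, end + 1) : null;
}

// Returns a validated action, or null if the output is unusable.
function parseAgentAction(raw: string): AgentAction | null {
  const json = extractJson(raw);
  if (json === null) return null;

  let value: unknown;
  try {
    value = JSON.parse(json);
  } catch {
    return null; // malformed JSON
  }
  if (!isRecord(value)) return null;

  const { action, selector, text, result } = value;
  switch (action) {
    case "click":
      return typeof selector === "string" ? { action: "click", selector } : null;
    case "extract":
      return typeof selector === "string" ? { action: "extract", selector } : null;
    case "type":
      return typeof selector === "string" && typeof text === "string"
        ? { action: "type", selector, text }
        : null;
    case "done":
      return typeof result === "string" ? { action: "done", result } : null;
    default:
      return null; // unknown or missing action name
  }
}
```

Returning null instead of throwing lets the caller decide whether to re-prompt the model or abort the task.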

Model Configuration

Default model: Qwen2.5-1.5B-Instruct-q4f16_1-MLC (~1GB)

Alternative models (configured in src/shared/constants.ts):

Troubleshooting

WebGPU not supported

Model fails to load

Actions not executing

Extension not working after Chrome update

Limitations

Tech Stack

Credits

This project is inspired by:

Dependency Licenses

Package              License
-------------------  ----------
@mlc-ai/web-llm      Apache-2.0
React                MIT
Vite                 MIT
@crxjs/vite-plugin   MIT
TypeScript           Apache-2.0

License

MIT License - See LICENSE file for details.