Client-side • Free

Turn any GitHub repo into a chat you can query

Browser-native RAG: embeddings, retrieval, and storage all run on-device via WebGPU. No server. API keys are encrypted locally.

WebGPU Inference: Embeddings computed on your GPU via WebGPU
🌲 AST Chunking: Code split by syntax, not line count
🔍 Hybrid Search: Combines vector and keyword search
🗜 Binary Quantization: 32x smaller index; 750 KB → 23 KB for 500 chunks
🔐 Encrypted Key Vault: API keys stored locally, locked by a passkey

How It Works

01 · Input

URL Trigger

proxy.ts validates owner/repo and starts local processing.
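The validation step can be sketched as below. The regexes approximate GitHub's owner/repo naming rules and are assumptions; the actual logic in proxy.ts is not shown here.

```typescript
// Hypothetical sketch of owner/repo validation (not the real proxy.ts).
// Owners: alphanumeric with inner hyphens, max 39 chars, no leading/trailing hyphen.
const OWNER_RE = /^[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?$/;
// Repos: letters, digits, dots, underscores, hyphens.
const REPO_RE = /^[a-zA-Z0-9._-]{1,100}$/;

function parseRepoUrl(input: string): { owner: string; repo: string } | null {
  let path: string;
  try {
    path = new URL(input).pathname; // full GitHub URL
  } catch {
    path = input; // allow bare "owner/repo"
  }
  const parts = path.replace(/^\/+|\/+$/g, "").split("/");
  if (parts.length < 2) return null;
  const [owner, repo] = parts;
  const cleanRepo = repo.replace(/\.git$/, "");
  if (!OWNER_RE.test(owner) || !REPO_RE.test(cleanRepo)) return null;
  return { owner, repo: cleanRepo };
}
```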

02 · Ingestion

GitHub Fetch

Repository tree and source blobs are pulled into browser memory.
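A minimal sketch of this step, assuming the public GitHub REST API; the file-extension allowlist and size cap are illustrative assumptions, not the app's actual filters.

```typescript
// Shape of an entry in GitHub's git/trees response.
interface TreeEntry { path: string; type: "blob" | "tree"; size?: number }

// Assumed allowlist of source extensions worth embedding.
const SOURCE_EXT = /\.(ts|tsx|js|jsx|py|rs|go|java|c|cc|cpp|h)$/;

// Keep only source blobs small enough to process in browser memory.
function selectSourceBlobs(entries: TreeEntry[], maxBytes = 200_000): string[] {
  return entries
    .filter(e => e.type === "blob" && SOURCE_EXT.test(e.path))
    .filter(e => (e.size ?? 0) <= maxBytes)
    .map(e => e.path);
}

// Usage (network call, shown for shape only):
// const res = await fetch(
//   `https://api.github.com/repos/${owner}/${repo}/git/trees/HEAD?recursive=1`);
// const { tree } = await res.json();
// const files = selectSourceBlobs(tree);
```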

03 · Parse

AST Chunker

tree-sitter WASM segments code into semantically stable chunks.
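tree-sitter needs a WASM grammar to run, so as an illustration this sketch uses a line-level heuristic instead: start a new chunk at each top-level declaration, approximating the syntax-aligned boundaries the real AST chunker produces.

```typescript
// Simplified stand-in for the tree-sitter chunker: split TypeScript source
// at top-level declarations rather than at arbitrary line counts.
const TOP_LEVEL = /^(export\s+)?(async\s+)?(function|class|interface|type|const|enum)\b/;

function chunkBySyntax(source: string): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  for (const line of source.split("\n")) {
    // A new top-level declaration closes the previous chunk.
    if (TOP_LEVEL.test(line) && current.length > 0) {
      chunks.push(current.join("\n"));
      current = [];
    }
    current.push(line);
  }
  if (current.length > 0) chunks.push(current.join("\n"));
  return chunks;
}
```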

04 · Compute

Embedding Pipeline

transformers.js runs WebGPU embeddings for chunk vectors.
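The model call itself is handled by transformers.js; the post-processing sketched below (mean pooling over token vectors, then L2 normalization) is a common pattern for producing one unit-length vector per chunk, and is an assumption rather than the app's exact pipeline.

```typescript
// Average token-level vectors into a single chunk vector.
function meanPool(tokenVectors: number[][]): number[] {
  const dim = tokenVectors[0].length;
  const out = new Array(dim).fill(0);
  for (const v of tokenVectors)
    for (let i = 0; i < dim; i++) out[i] += v[i] / tokenVectors.length;
  return out;
}

// Scale to unit length so dot product equals cosine similarity.
function l2Normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((s, x) => s + x * x, 0)) || 1;
  return v.map(x => x / norm);
}
```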

05 · Optimize

Binary Quantization

Vectors are compressed for fast local similarity operations.
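Binary quantization keeps one sign bit per dimension: a 384-dim float32 vector (1,536 bytes) packs into 48 bytes, the 32x reduction quoted above. The 384-dim figure is an inference from the 750 KB → 23 KB numbers for 500 chunks, not something the source states directly.

```typescript
// Pack the sign of each dimension into a bit array (1 = positive).
function quantize(vector: number[]): Uint8Array {
  const packed = new Uint8Array(Math.ceil(vector.length / 8));
  vector.forEach((x, i) => {
    if (x > 0) packed[i >> 3] |= 1 << (i & 7); // set bit i
  });
  return packed;
}
```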

06 · Persist

IndexedDB

IndexedDB stores vectors, metadata, and symbol structure locally.

07 · Retrieval

Hybrid Search

Hamming distance and regex matching are fused for recall.
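The two halves of the hybrid step can be sketched as follows. Hamming distance over the packed sign bits is standard; the source does not specify the fusion rule, so reciprocal-rank fusion (RRF) is used here as a common, assumed choice.

```typescript
// Count differing bits between two packed binary vectors.
function hamming(a: Uint8Array, b: Uint8Array): number {
  let dist = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i];
    while (x) { dist += x & 1; x >>= 1; } // per-byte popcount
  }
  return dist;
}

// Reciprocal-rank fusion of the vector and keyword result lists (assumed rule).
function rrf(vectorRank: string[], keywordRank: string[], k = 60): string[] {
  const score = new Map<string, number>();
  for (const list of [vectorRank, keywordRank])
    list.forEach((id, i) =>
      score.set(id, (score.get(id) ?? 0) + 1 / (k + i + 1)));
  return Array.from(score.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```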

08 · Selection

Matryoshka Reranker

Candidate chunks are reranked before prompt construction.
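Matryoshka embeddings keep most of their semantics in their leading dimensions, so candidates from the cheap binary pass can be rescored on a float-precision prefix. A sketch, where the 128-dim prefix is an illustrative choice, not the app's stated setting:

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Rerank candidates on a truncated Matryoshka prefix of the full embedding.
function rerank(
  query: number[],
  candidates: { id: string; vec: number[] }[],
  dims = 128,
) {
  const q = query.slice(0, dims);
  return candidates
    .map(c => ({ id: c.id, score: cosine(q, c.vec.slice(0, dims)) }))
    .sort((a, b) => b.score - a.score);
}
```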

09 · Inference

WebLLM Worker

Qwen2-0.5B generates grounded answers in a dedicated worker.
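Before the worker runs the model, the reranked chunks have to be assembled into a grounded prompt. The template below is a hypothetical illustration of that step, not the app's actual prompt.

```typescript
// Build a context-grounded prompt from retrieved chunks (assumed template).
function buildPrompt(
  question: string,
  chunks: { path: string; text: string }[],
): string {
  const context = chunks
    .map((c, i) => `[${i + 1}] ${c.path}\n${c.text}`) // numbered for citations
    .join("\n\n");
  return `Answer using only the context below. Cite sources as [n].\n\n` +
    `${context}\n\nQuestion: ${question}`;
}
```

Keeping generation in a dedicated worker, as the step above describes, prevents token-by-token decoding from blocking the UI thread.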

10 · Output

Chat UI + CoVe

The answer and a Chain-of-Verification (CoVe) pass are streamed to the interface.