After some time, I decided to build a lightweight UI for my local LLM experiments. The goal was simple: a single-file web interface that supports RAG (Retrieval-Augmented Generation) without the bloat of modern frameworks.
The result is Bender Enterprise v6.1 – a minimalist, responsive chat interface that runs 100% locally.
Key Features:
- True Local RAG: Extract text from PDF, DOCX, and XLSX files directly in the browser and use it as context for Ollama.
- Pure Web Technologies: Built with Tailwind CSS and Vanilla JS. No node_modules, no build steps.
- Persistent State: Conversation history and document context are saved in localStorage.
- Witty Persona: A custom system prompt gives the AI a bit of a “Bender” (Futurama) personality.
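As a rough sketch of the persistence idea, chat history can round-trip through localStorage as JSON. The key and function names below are illustrative, not the actual identifiers from the repo, and the code falls back to an in-memory object so it also runs outside a browser:

```javascript
// Illustrative storage key; the real app may use a different one.
const HISTORY_KEY = "bender-history";

// Use localStorage in the browser; fall back to a tiny in-memory shim elsewhere.
const storage = typeof localStorage !== "undefined"
  ? localStorage
  : { _m: {}, getItem(k) { return this._m[k] ?? null; }, setItem(k, v) { this._m[k] = String(v); } };

// Serialize the message array and write it under the history key.
function saveHistory(messages) {
  storage.setItem(HISTORY_KEY, JSON.stringify(messages));
}

// Read the stored history back, or return an empty list on first run.
function loadHistory() {
  const raw = storage.getItem(HISTORY_KEY);
  return raw ? JSON.parse(raw) : [];
}
```

Because everything lives in localStorage, closing the tab loses nothing, and no server-side database is needed.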
How it works
The UI communicates with the Ollama API (running on localhost:11434). When you upload a document, the script parses the content (using libraries like PDF.js and Mammoth) and injects it into the prompt. This allows the local model to answer questions based strictly on your private data.
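The injection step can be sketched as follows. Ollama's `/api/generate` endpoint with a `{ model, prompt, stream }` body is its documented API; the prompt template and model name here are assumptions, not the exact ones used in the repo:

```javascript
// Sketch: wrap the extracted document text around the user's question.
// The template wording is illustrative.
function buildPrompt(docText, question) {
  return `Use only the following document as context:\n\n${docText}\n\nQuestion: ${question}`;
}

// Sketch: send the combined prompt to a local Ollama instance.
// /api/generate with { model, prompt, stream } is Ollama's documented API;
// the model name is just an example.
async function askOllama(docText, question) {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3",
      prompt: buildPrompt(docText, question),
      stream: false, // return one complete JSON object instead of a stream
    }),
  });
  const data = await res.json();
  return data.response; // Ollama puts the completion in the `response` field
}
```

Since the document text never leaves the browser except for this localhost request, nothing is sent to an external service.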
Quick Setup
If you have Python installed, you can launch the interface with a simple one-liner:
```bash
python3 -m http.server 8000
```
The code is open-source and available on GitHub. Feel free to “shred” some documents with it!
GitHub Repository: dbunic/bender-local-rag