Open WebUI + Ollama: Build Your Own Private ChatGPT, Data Never Leaks
Deploy a complete local AI chat system in 30 minutes
Many people hesitate to put company files, contracts, or code into ChatGPT or Claude—yet they need AI help to process these materials.
The Open WebUI + Ollama combination solves this dilemma: you get a ChatGPT-level user experience, but all data is processed only on your own computer.
1. System Requirements
2. Complete Installation Guide
Step 1: Install Ollama
bash
# Linux (the install script targets Linux; on macOS, download the app from https://ollama.com/download)
curl -fsSL https://ollama.com/install.sh | sh
# Download recommended models
ollama pull qwen2.5:7b # Chinese tasks (main)
ollama pull llama3.2:3b # Quick tasks
ollama pull qwen2.5-coder:7b # Code tasks
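Before moving on, it can help to confirm that the models finished downloading. The sketch below assumes that `ollama list` prints a header row followed by one model per line with the model name in the first column, and that Ollama listens on its default port 11434 (the port used in Step 2). `has_model` is a hypothetical helper name, not part of Ollama.

```shell
# Hypothetical helper: succeeds if the model named in $1 appears in the
# first column of `ollama list`-style output read from stdin.
# The header row (line 1) is skipped.
has_model() {
  awk -v want="$1" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}

# Usage (requires a running Ollama install):
#   curl -fsS http://localhost:11434/api/tags    # JSON list of local models
#   ollama list | has_model qwen2.5:7b && echo "qwen2.5:7b is ready"
```

Matching on the exact first column avoids false positives such as `qwen2.5:7b` matching `qwen2.5:72b`.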
Step 2: Install Open WebUI (Docker method)
bash
docker run -d \
-p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
-e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
-e WEBUI_AUTH=true \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
Visit http://localhost:3000. On first visit, you need to register an admin account.
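If the page does not load right away, the container may still be starting. The following is a minimal sketch of a retry loop that waits for the web interface and falls back to the container logs. `wait_for` is a hypothetical helper, and the attempt count and delay are arbitrary values chosen for illustration.

```shell
# Hypothetical helper: retry a probe command until it succeeds or the
# attempt limit is reached.
# $1 = max attempts, $2 = seconds between tries, rest = the probe command.
wait_for() {
  attempts=$1; delay=$2; shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" > /dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (requires Docker and the container from Step 2):
#   docker ps --filter name=open-webui        # is the container up?
#   wait_for 30 2 curl -fsS http://localhost:3000 \
#     || docker logs --tail 50 open-webui     # show recent logs on failure
```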
Step 3: Non-Docker method (pip install)
bash
pip install open-webui
open-webui serve
3. Core Feature Configuration
3.1 Add Claude / OpenAI API (hybrid use)
If you want to use Claude or GPT alongside local models:
- Anthropic: https://api.anthropic.com/v1, API Key: your Anthropic key
- OpenAI: https://api.openai.com/v1, API Key: your OpenAI key
This way, you can choose between local and cloud models for different tasks within the same interface.
3.2 Document Upload and RAG
Open WebUI has built-in document processing capabilities:
Best practices for private document handling:
3.3 Create Custom Roles (System Prompt Templates)
Create a "Code Reviewer" role:
Name: Senior Code Reviewer
Model: qwen2.5-coder:7b
System Prompt: You are a TypeScript developer with 10 years of experience.
When reviewing code, focus on: type safety, potential bugs, performance issues, security vulnerabilities.
List issues in bullet points, each with a fix suggestion.
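The same role can also be baked into a model on the Ollama side, which is a different mechanism from Open WebUI's own role settings. The sketch below assumes Ollama's Modelfile `FROM` and `SYSTEM` instructions and the `ollama create` command. The helper name `write_reviewer_modelfile` and the model name `code-reviewer` are hypothetical.

```shell
# Hypothetical helper: write a Modelfile to the path in $1 that layers the
# reviewer system prompt on top of qwen2.5-coder:7b.
write_reviewer_modelfile() {
  cat > "$1" <<'EOF'
FROM qwen2.5-coder:7b
SYSTEM """You are a TypeScript developer with 10 years of experience.
When reviewing code, focus on: type safety, potential bugs, performance issues, security vulnerabilities.
List issues in bullet points, each with a fix suggestion."""
EOF
}

# Usage (requires Ollama and the base model from Step 1):
#   write_reviewer_modelfile ./Modelfile
#   ollama create code-reviewer -f ./Modelfile
#   ollama run code-reviewer "Review: const x: any = JSON.parse(input)"
```

A model created this way should then appear in Open WebUI's model selector like any other local model, with the prompt applied regardless of which client connects to Ollama.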
4. Team Shared Deployment
If multiple team members need access, deploy Open WebUI on a LAN server:
bash
# Run on the server; 0.0.0.0 publishes the port on all interfaces, including the LAN
docker run -d \
-p 0.0.0.0:3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
-e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
-e WEBUI_AUTH=true \
--name open-webui \
ghcr.io/open-webui/open-webui:main
Other computers on the LAN can access it at http://.
Note: If external access is needed, you must configure HTTPS and stronger authentication (do not expose directly to the public internet).
5. FAQ
Q: Model responses are very slow. What can I do?
A: First, check whether GPU acceleration is active (ollama ps). Then try a smaller model (3B is 2-3x faster than 7B), or a more aggressively quantized build of the same model.
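To make the GPU check concrete, here is a minimal sketch that flags models running partly or fully on the CPU. It assumes that `ollama ps` prints a header row and a PROCESSOR column with values such as `100% GPU` or `48%/52% CPU/GPU`; the exact layout may differ between Ollama versions. `models_on_cpu` is a hypothetical helper name.

```shell
# Hypothetical helper: read `ollama ps`-style output on stdin and print the
# name of every loaded model whose row mentions the CPU.
# The header row (line 1) is skipped.
models_on_cpu() {
  awk 'NR > 1 && /CPU/ { print $1 }'
}

# Usage (requires a running Ollama with at least one loaded model):
#   ollama ps | models_on_cpu
```

An empty result suggests every loaded model fits entirely on the GPU. A model listed here is at least partly offloaded to the CPU, which is a common reason for slow responses.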
Q: AI responses are inaccurate after document upload
A: This is a common RAG issue. Try: (1) Narrow down the document scope, upload only relevant parts; (2) Be more specific when referencing document sections in your question; (3) Adjust the Top-K parameter for vector retrieval.
Q: How do I back up conversation history?
A: Open WebUI data is stored in the Docker volume open-webui. Backup method:
bash
docker run --rm -v open-webui:/source -v "$(pwd)":/backup ubuntu tar czf /backup/webui-backup.tar.gz -C /source .
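A backup is only useful if it can be read and restored. The sketch below mirrors the backup command in the other direction, assuming the archive is named webui-backup.tar.gz and sits in the current directory. Both function names are hypothetical, and stopping the container during the restore is a precaution rather than a documented requirement.

```shell
# Hypothetical helper: succeed only if the archive in $1 can be listed,
# which catches truncated or corrupt backups before they are needed.
verify_backup() {
  tar tzf "$1" > /dev/null 2>&1
}

# Hypothetical helper: unpack webui-backup.tar.gz from the current directory
# into the open-webui volume. The container is stopped first so that no
# files change during the restore. Existing files with the same names are
# overwritten; other files in the volume are left in place.
restore_backup() {
  docker stop open-webui &&
  docker run --rm -v open-webui:/target -v "$(pwd)":/backup ubuntu \
    tar xzf /backup/webui-backup.tar.gz -C /target &&
  docker start open-webui
}

# Usage (requires Docker and an existing backup):
#   verify_backup ./webui-backup.tar.gz && restore_backup
```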