
Complete Local AI Deployment Guide 2026: Ollama + Open WebUI + Private Knowledge Base, Zero Data Leakage Solution

Run ChatGPT-Level AI on Your Own Computer

If your work involves sensitive data—client information, financial data, internal documents—you probably shouldn't be sending this content to OpenAI's servers.

Running the models locally solves this: prompts and documents never leave your own machine.

Hardware Requirements

Minimum Configuration (Running 7B Models)


CPU: 8+ cores
RAM: 16GB
GPU: 8GB VRAM (optional but recommended)
Storage: 50GB free space
OS: macOS / Linux / Windows

Recommended Configuration (Running 13B-70B Models)


GPU: NVIDIA 24GB+ VRAM (RTX 4090 / A6000)
      or Apple Silicon M2/M3 Pro+ (Unified Memory)
RAM: 32GB+
Storage: 200GB SSD

Apple Silicon Advantage: The unified memory architecture of M2/M3 chips lets the GPU use system memory directly, which makes MacBooks efficient for running local models. An M3 Pro with 18GB of unified memory can run 4-bit quantized 13B models comfortably.
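To sanity-check whether a model is likely to fit on your hardware, a rough estimate helps. The sketch below uses a common rule of thumb (parameters × bits per weight ÷ 8, plus about 20% for the KV cache and runtime overhead). Both the 20% allowance and the helper name `estimate_mem_gb` are assumptions for illustration, not figures from Ollama or any vendor.

```shell
# Rough rule of thumb (an assumption, not a vendor figure):
#   memory_GB ~= params_in_billions * bits_per_weight / 8 * 1.2
# The 1.2 factor is a guessed allowance for KV cache and runtime overhead.
estimate_mem_gb() {
  params_b=$1   # model size in billions of parameters, e.g. 7
  bits=$2       # bits per weight after quantization, e.g. 4 or 16
  # Integer ceiling of params_b * bits * 12 / 80 (i.e. * 1.2 / 8) without floats.
  echo $(( (params_b * bits * 12 + 79) / 80 ))
}

estimate_mem_gb 7 4     # 7B model, 4-bit quantization
estimate_mem_gb 13 4    # 13B model, 4-bit quantization
estimate_mem_gb 70 4    # 70B model, 4-bit quantization
```

Treat the result as a lower bound in GB: long context windows and concurrent requests push real usage higher.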

Installing and Configuring Ollama

Installation

bash
# macOS / Linux
curl -fsSL https://ollama.ai/install.sh | sh

# Or download the installer directly:
# https://ollama.ai/download

Pulling Common Models

bash
# General conversation (3B by default, recommended for beginners)
ollama pull llama3.2

# Code-specific (lightweight and fast)
ollama pull qwen2.5-coder:7b

# Chinese-optimized (DeepSeek)
ollama pull deepseek-r1:7b

# Multimodal (supports images)
ollama pull llava:13b

# Ultra-lightweight (low-end machines)
ollama pull phi3:mini

Running Models

bash
# Interactive command line
ollama run llama3.2

# Run as an API service (default port 11434)
ollama serve
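Once the service is listening on port 11434, other programs can talk to it over HTTP. The sketch below wraps Ollama's `/api/generate` endpoint in two small shell helpers. The names `generate_payload` and `ask_ollama` and the `OLLAMA_URL` variable are made up for this example, and the payload builder does no JSON escaping, so it assumes prompts without double quotes or backslashes.

```shell
# Base URL of the local Ollama server (default port from the text above).
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Build the JSON body for /api/generate.
# Assumption: model and prompt contain no double quotes or backslashes,
# because no JSON escaping is performed here.
generate_payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$2"
}

# Send one non-streaming prompt to a running Ollama server and print the raw JSON reply.
ask_ollama() {
  curl -s "$OLLAMA_URL/api/generate" \
    -H 'Content-Type: application/json' \
    -d "$(generate_payload "$1" "$2")"
}

# Usage (requires `ollama serve` to be running and the model to be pulled):
#   ask_ollama llama3.2 "Why is the sky blue?"
```

The reply is a JSON object whose `response` field holds the generated text. If `jq` happens to be installed, piping the output through `jq -r .response` extracts just that field.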

Open WebUI: A ChatGPT-Style Interface

Docker Installation (Recommended)

bash
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main

Open http://localhost:3000 in a browser to use it.
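The container can take a little while to start answering requests after `docker run` returns, especially on first launch. A small polling helper, sketched below, can wait for the page to come up before you open it. The function name `wait_for_http`, the default of 30 attempts, and the 2-second delays are arbitrary choices for this example.

```shell
# Poll a URL until it answers or the attempts run out.
# Returns 0 as soon as the server responds successfully, 1 otherwise.
wait_for_http() {
  url=$1
  attempts=${2:-30}   # default number of tries (arbitrary)
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; --max-time bounds each try to 2 seconds.
    if curl -fsS -o /dev/null --max-time 2 "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    # Pause between tries, but not after the final one.
    if [ "$i" -le "$attempts" ]; then
      sleep 2
    fi
  done
  return 1
}

# Usage:
#   wait_for_http http://localhost:3000 && echo "Open WebUI is up"
```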

Key Features

  • Multi-model switching (switch between different Ollama models)
  • Conversation history management
  • System Prompt presets
  • File upload (supports PDF/documents)
  • Image understanding (requires multimodal model)
  • Multi-user management

AnythingLLM: Private Knowledge Base

Installation

bash
# Docker installation
docker pull mintplexlabs/anythingllm
docker run -d -p 3001:3001 \
  -v "$(pwd)/anythingllm:/app/server/storage" \
  mintplexlabs/anythingllm

Configuring the Knowledge Base

  • Go to Settings → LLM Provider → Select Ollama
  • Create a Workspace (knowledge base)
  • Upload documents (PDF/Word/TXT/web links)
  • Start chatting with your documents

Use Cases

  • Internal company document Q&A
  • Contract/regulation queries
  • Product manual intelligent customer service
  • Personal notes knowledge base

Model Recommendations (Latest 2026)

| Use Case | Recommended Model | VRAM Requirement |
|---|---|---|
| General conversation | Llama 3.3 70B (Q4) | 40GB |
| General conversation (lightweight) | Qwen2.5 7B | 8GB |
| Code generation | Qwen2.5-Coder 32B | 20GB |
| Chinese-specific | DeepSeek-V3 (Q4) | 40GB |
| Small local model | Phi-4 mini | 4GB |
| Multimodal | LLaVA 13B | 16GB |
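To narrow the list down for a particular machine, the table can be filtered by the VRAM you have available. The sketch below is one way to do that. The helper name `models_that_fit` is made up for this example, and the model/VRAM pairs are copied from the table above, so treat them as approximate.

```shell
# Print the models from the table whose listed VRAM requirement fits the budget.
# The name|GB pairs are copied from the table above and are approximate.
models_that_fit() {
  budget_gb=$1
  while IFS='|' read -r name need_gb; do
    if [ "$need_gb" -le "$budget_gb" ]; then
      echo "$name"
    fi
  done <<'EOF'
Llama 3.3 70B (Q4)|40
Qwen2.5 7B|8
Qwen2.5-Coder 32B|20
DeepSeek-V3 (Q4)|40
Phi-4 mini|4
LLaVA 13B|16
EOF
  return 0
}

models_that_fit 8    # models listed at 8GB or less
```

With an 8GB budget this lists Qwen2.5 7B and Phi-4 mini. A 24GB card would also admit Qwen2.5-Coder 32B and LLaVA 13B.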


Further Reading

  • Complete Guide to Building a RAG Knowledge Base
  • Docker Containerized AI Application Deployment