Electron AI Desktop Apps: Complete Integration Guide

Building AI-powered desktop applications with Electron

Electron remains the fastest path from web-stack team to desktop AI app — and several of the most-used AI desktop products ship on it (ChatGPT's desktop app, Claude's desktop app, Cursor among them). The framework's costs (bundle size, RAM) are real but well-understood; this guide focuses on the AI-specific architecture: where API calls live, how to stream into your renderer, and the local-model story.

The architecture rule: AI calls live in the main process

Electron gives you a Node.js main process and Chromium renderer processes. Two non-negotiables for AI apps:

  • API keys never enter the renderer — it's a web page; anything there is extractable. Keys live in the main process, stored via safeStorage (OS-level encryption), and renderers request completions over IPC.
  • Renderers get a narrow, typed bridge: contextIsolation: true (the default) plus a preload script exposing exactly the AI operations you support, nothing more.
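
On the window side, the second rule comes down to a handful of webPreferences flags. The following is a minimal sketch, not a definitive setup: secureWebPreferences is a hypothetical helper name, and the local RendererPrefs interface only mirrors the Electron option names used here so the snippet type-checks without Electron installed.

```typescript
// Subset of Electron's webPreferences options, declared locally.
// Assumption: in a real app you would use the WebPreferences type from 'electron'.
interface RendererPrefs {
  contextIsolation: boolean;
  nodeIntegration: boolean;
  sandbox: boolean;
  preload: string;
}

// Hypothetical helper: locked-down preferences for any window that uses the AI bridge.
function secureWebPreferences(preloadPath: string): RendererPrefs {
  return {
    contextIsolation: true, // page scripts cannot reach the preload script's scope
    nodeIntegration: false, // no require() or Node globals inside the page
    sandbox: true,          // renderer runs inside the Chromium sandbox
    preload: preloadPath,   // absolute path to the compiled preload script
  };
}
```

In the main process this would be passed along the lines of new BrowserWindow({ webPreferences: secureWebPreferences(path.join(__dirname, 'preload.js')) }); the preload file name is an assumption about your build output.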
    typescript
    // main.ts
    import { ipcMain, safeStorage } from 'electron';
    import OpenAI from 'openai';

    ipcMain.handle('ai:ask', async (_e, prompt: string) => {
      const client = new OpenAI({ apiKey: loadKey() }); // decrypted via safeStorage
      const resp = await client.chat.completions.create({
        model: 'gpt-4o-mini',
        messages: [{ role: 'user', content: prompt }],
      });
      return resp.choices[0].message.content;
    });

    typescript
    // preload.ts — the only surface the renderer sees
    import { contextBridge, ipcRenderer } from 'electron';
    contextBridge.exposeInMainWorld('ai', {
      ask: (prompt: string) => ipcRenderer.invoke('ai:ask', prompt),
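      askStream: (prompt: string) => ipcRenderer.invoke('ai:askStream', prompt),
      onToken: (cb: (t: string) => void) => {
        const listener = (_e: unknown, t: string) => cb(t);
        ipcRenderer.on('ai:token', listener);
        // Return an unsubscribe function instead of leaking ipcRenderer to the page
        return () => { ipcRenderer.removeListener('ai:token', listener); };
      },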
    });
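
The main-process handler calls loadKey() without showing it. One possible shape is sketched below, under stated assumptions: writeKey and readKey are hypothetical names, the encrypted key is kept in a single file, and the cipher is passed in as a parameter whose three methods match Electron's safeStorage (isEncryptionAvailable, encryptString, decryptString) so the code can also be exercised outside Electron.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Structural subset of Electron's safeStorage; the real module satisfies this shape.
interface StringCipher {
  isEncryptionAvailable(): boolean;
  encryptString(plainText: string): Buffer;
  decryptString(encrypted: Buffer): string;
}

// Hypothetical helper: encrypt the API key and write it to `file`.
function writeKey(cipher: StringCipher, file: string, apiKey: string): void {
  if (!cipher.isEncryptionAvailable()) {
    throw new Error('OS encryption unavailable; refusing to store the key in plaintext');
  }
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, cipher.encryptString(apiKey), { mode: 0o600 });
}

// Hypothetical helper: returns the decrypted key, or null if none has been stored yet.
function readKey(cipher: StringCipher, file: string): string | null {
  if (!fs.existsSync(file)) return null;
  return cipher.decryptString(fs.readFileSync(file));
}
```

In main.ts, loadKey() could then wrap readKey(safeStorage, path.join(app.getPath('userData'), 'api.key')); the file name is an assumption. Note that safeStorage is only usable after the app's ready event has fired.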
    

Streaming tokens over IPC

ipcRenderer.invoke resolves exactly once, so it cannot deliver tokens as they arrive. For token streaming, push events from main to the renderer:

    typescript
    // main.ts
    ipcMain.handle('ai:askStream', async (event, prompt: string) => {
      const client = new OpenAI({ apiKey: loadKey() });   // same setup as the 'ai:ask' handler
      const stream = await client.chat.completions.create({
        model: 'gpt-4o-mini',
        messages: [{ role: 'user', content: prompt }],
        stream: true,
      });
      for await (const chunk of stream) {
        if (event.sender.isDestroyed()) return;           // window closed: stop paying
        const token = chunk.choices[0]?.delta?.content ?? '';
        if (token) event.sender.send('ai:token', token);
      }
      // Sending to a destroyed webContents throws, so check once more before signalling completion
      if (!event.sender.isDestroyed()) event.sender.send('ai:done');
    });
    

The renderer subscribes via the preload bridge and appends tokens: the same UX as SSE in a web app, with no HTTP layer needed. The isDestroyed() check is the desktop equivalent of the disconnect check in our FastAPI streaming recipe; without it, a closed window keeps consuming the stream and you keep paying for tokens nobody sees.
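
The renderer side of this can be sketched as follows. This is an illustration under assumptions rather than the guide's own code: it assumes the preload bridge exposes askStream(prompt) and that onToken returns an unsubscribe function, and streamInto is a hypothetical helper name. It also assumes the invoke promise settles only after the last token event has been delivered.

```typescript
// Assumed shape of the object the preload script exposes as window.ai.
interface AiBridge {
  askStream(prompt: string): Promise<void>;
  onToken(cb: (token: string) => void): () => void; // returns an unsubscribe function
}

// Hypothetical helper: accumulate streamed tokens and re-render after each one.
async function streamInto(
  bridge: AiBridge,
  prompt: string,
  render: (textSoFar: string) => void,
): Promise<string> {
  let text = '';
  const unsubscribe = bridge.onToken((token) => {
    text += token;
    render(text);
  });
  try {
    // Settles when the main-process handler returns, i.e. once the stream has ended
    await bridge.askStream(prompt);
  } finally {
    unsubscribe(); // avoid stacking listeners across requests
  }
  return text;
}
```

In a real renderer the call might look like streamInto(window.ai, prompt, (t) => { output.textContent = t; }), where output is whatever element shows the answer. Because all tokens share one 'ai:token' channel, this sketch only supports one in-flight request at a time.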

Local models: Ollama sidecar or node bindings

  • Talk to Ollama (recommended): main process calls localhost:11434, detects/launches Ollama as needed (child_process.spawn), streams via the same IPC pattern. Model management UX (which models, pull progress, disk usage) is yours to build — Ollama vs LM Studio vs Jan covers what users expect.
  • Embed via node-llama-cpp: llama.cpp bindings run GGUF models in-process with GPU support — fully self-contained apps, at the cost of shipping model weights and owning hardware variance. Run inference in a worker thread or utility process so the main process stays responsive.
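
For the Ollama route, the main process needs to turn Ollama's streamed HTTP response into tokens before forwarding them over IPC. Below is a minimal sketch, assuming Ollama's /api/chat endpoint with stream: true, which replies with newline-delimited JSON objects whose message.content field holds the next piece of text. OllamaTokenParser and ollamaChat are hypothetical names, and the global fetch assumes a Node 18+ runtime.

```typescript
// Incrementally splits a newline-delimited JSON stream into text tokens.
class OllamaTokenParser {
  private buffer = '';

  // Feed decoded text; returns the tokens from every complete line seen so far.
  push(text: string): string[] {
    this.buffer += text;
    const lines = this.buffer.split('\n');
    this.buffer = lines.pop() ?? ''; // the last piece may be an incomplete line
    const tokens: string[] = [];
    for (const line of lines) {
      if (!line.trim()) continue;
      const msg = JSON.parse(line) as { message?: { content?: string } };
      const token = msg.message?.content;
      if (token) tokens.push(token);
    }
    return tokens;
  }
}

// Hypothetical wrapper: stream a chat reply from a local Ollama server.
async function* ollamaChat(
  model: string,
  prompt: string,
  signal?: AbortSignal,
): AsyncGenerator<string> {
  const res = await fetch('http://localhost:11434/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, messages: [{ role: 'user', content: prompt }], stream: true }),
    signal,
  });
  if (!res.ok || !res.body) throw new Error(`Ollama request failed: ${res.status}`);
  const parser = new OllamaTokenParser();
  const decoder = new TextDecoder();
  const reader = res.body.getReader();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const token of parser.push(decoder.decode(value, { stream: true }))) yield token;
  }
}
```

Inside an 'ai:askStream'-style handler, each yielded token would be forwarded with event.sender.send('ai:token', token). Passing an AbortSignal lets the handler cancel generation when the window is destroyed.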
Electron vs Tauri for AI apps, honestly

|                    | Electron                                      | Tauri                                            |
|--------------------|-----------------------------------------------|--------------------------------------------------|
| Team skills        | All JS/TS                                     | Rust core required                               |
| Bundle / RAM       | ~100MB+, heavier                              | MBs, lighter                                     |
| Rendering          | Identical Chromium everywhere                 | System webview (varies per OS)                   |
| Local inference    | node-llama-cpp / sidecar                      | Rust bindings (llama.cpp, Candle); stronger fit  |
| Ecosystem maturity | Very deep (updater, crash reporting, signing) | Good and growing                                 |

Honest rule: an all-web-stack team shipping fast should pick Electron; if binary size and RAM are product values, or there is Rust on the team, pick Tauri. For AI specifically, Electron's renderer consistency matters more than usual: AI UIs lean on modern CSS and canvas, and debugging webview differences across three OSes is time not spent on the product.

Production notes

  • Auto-update (electron-updater) is mandatory for AI apps — models, prompts, and provider APIs change monthly; keep app updates and any local-model downloads separate.
  • Offline behavior: detect connectivity and degrade visibly (queue requests, or fall back to a local model) — desktop users expect apps to work on planes.
  • Spend visibility: usage accrues per installed user on your key (or theirs) — track per-user token counts and expose them in-app; surprise bills kill desktop AI products.
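
The spend-visibility note can be sketched as a small in-memory counter in the main process. This is one possible shape under assumptions: UsageTracker is a hypothetical class, and the TokenUsage fields mirror the prompt_tokens and completion_tokens counts on the usage object of OpenAI chat completion responses. Persisting the totals and converting them to cost are left out.

```typescript
// Mirrors the token counts on a chat completion response's `usage` object.
interface TokenUsage {
  prompt_tokens: number;
  completion_tokens: number;
}

interface Totals {
  requests: number;
  promptTokens: number;
  completionTokens: number;
}

// Hypothetical per-install tracker, keyed by model name.
class UsageTracker {
  private byModel = new Map<string, Totals>();

  record(model: string, usage: TokenUsage | null | undefined): void {
    if (!usage) return; // some responses carry no usage data
    const t = this.byModel.get(model) ?? { requests: 0, promptTokens: 0, completionTokens: 0 };
    t.requests += 1;
    t.promptTokens += usage.prompt_tokens;
    t.completionTokens += usage.completion_tokens;
    this.byModel.set(model, t);
  }

  // Plain object so it can be returned from an IPC handler and shown in-app.
  summary(): Record<string, Totals & { totalTokens: number }> {
    const out: Record<string, Totals & { totalTokens: number }> = {};
    this.byModel.forEach((t, model) => {
      out[model] = { ...t, totalTokens: t.promptTokens + t.completionTokens };
    });
    return out;
  }
}
```

A handler would call tracker.record('gpt-4o-mini', resp.usage) after each completion and expose summary() over its own IPC channel. For streamed requests, the OpenAI API only reports usage when the request sets stream_options: { include_usage: true }, in which case the counts arrive on the final chunk.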
FAQ

Can I use the Vercel AI SDK in the renderer? UI hooks, yes, but route the actual provider calls through main-process IPC so keys stay out of the renderer. Don't call providers directly from renderer code even with a user-supplied key, unless you're comfortable with that key living in a web context.

Voice features? Run whisper.cpp via node bindings (or an Ollama-adjacent server) in the main or utility process, and stream transcription results over the same IPC events.


    *Last updated: June 2026.*
