Swift AI for macOS

Native macOS AI applications with Swift and Foundation Models

There are two ways to build AI features into native macOS (and iOS) apps with Swift: call cloud LLM APIs over the network, or run models on-device with Apple's frameworks for privacy and offline use. This guide covers both and when to choose each.

Two approaches

Cloud APIs from Swift: call OpenAI/Anthropic over HTTPS with URLSession. This gives you the most capable models and is the simplest way to start, but it needs a network connection and sends user data off-device.

On-device: run models locally via Apple's frameworks (Core ML, the Foundation Models framework behind Apple Intelligence) or local runtimes (MLX, or llama.cpp with its Metal backend). Private, offline, and no per-call cost, but limited to smaller models.

Calling a cloud LLM from Swift

swift
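import Foundation

struct ChatReq: Encodable { let model: String; let messages: [[String: String]] }
struct ChatResp: Decodable {
    struct Message: Decodable { let content: String }
    struct Choice: Decodable { let message: Message }
    let choices: [Choice]
}

// apiKey comes from the caller (e.g. loaded from the Keychain), never hard-coded.
func ask(_ prompt: String, apiKey: String) async throws -> String {
    var req = URLRequest(url: URL(string: "https://api.openai.com/v1/chat/completions")!)
    req.httpMethod = "POST"
    req.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    req.setValue("application/json", forHTTPHeaderField: "Content-Type")
    req.httpBody = try JSONEncoder().encode(
        ChatReq(model: "gpt-4o", messages: [["role": "user", "content": prompt]]))
    let (data, _) = try await URLSession.shared.data(for: req)
    // Decode choices[0].message.content; an empty choices array yields "".
    let resp = try JSONDecoder().decode(ChatResp.self, from: data)
    return resp.choices.first?.message.content ?? ""
}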

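For responsive UIs, stream the response instead of waiting for the full body: set "stream": true in the request and read the server-sent events (SSE) line by line with URLSession's bytes(for:) API. This is the same idea as Streaming AI with SSE.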
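As a minimal sketch of the streaming side, the code below parses OpenAI-style SSE lines into text deltas. The names StreamChunk, deltaText, and streamDeltas are hypothetical helpers introduced here for illustration, and the chunk shape (choices[0].delta.content, terminated by a "[DONE]" line) is assumed from the chat completions streaming format; verify it against the API you call.

```swift
import Foundation

// One streamed chunk; only the fields read here are declared (assumed shape).
struct StreamChunk: Decodable {
    struct Delta: Decodable { let content: String? }
    struct Choice: Decodable { let delta: Delta }
    let choices: [Choice]
}

/// Extracts the text delta from one SSE line, or nil for lines that carry
/// no text (comments, the "[DONE]" sentinel, role-only or empty deltas).
func deltaText(fromSSELine line: String) -> String? {
    guard line.hasPrefix("data:") else { return nil }
    let payload = line.dropFirst(5).trimmingCharacters(in: .whitespaces)
    guard payload != "[DONE]" else { return nil }
    let chunk = try? JSONDecoder().decode(StreamChunk.self, from: Data(payload.utf8))
    return chunk?.choices.first?.delta.content
}

/// Hands each text delta to `onDelta` as lines arrive from any async
/// sequence of strings (hypothetical helper, not part of any SDK).
func streamDeltas<Lines: AsyncSequence>(
    from lines: Lines,
    onDelta: (String) -> Void
) async throws where Lines.Element == String {
    for try await line in lines {
        if let text = deltaText(fromSSELine: line) { onDelta(text) }
    }
}
```

On macOS 12 or later the lines could come from URLSession: obtain (bytes, _) from try await URLSession.shared.bytes(for: req), with "stream": true added to the request body, and pass bytes.lines to streamDeltas, appending each delta to the text shown in the UI.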

On-device with Apple frameworks

Apple's Foundation Models framework exposes the on-device LLM behind Apple Intelligence to Swift apps for summarization, generation, and tool calling. It is private and works offline, which makes it a good fit for system-integrated features, but it requires macOS 26 or iOS 26 on a device with Apple Intelligence enabled. Core ML runs your own converted models; MLX and llama.cpp run open-weight models on Apple Silicon GPUs via Metal. For picking a small local model, see local LLM comparison.
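A rough sketch of calling the on-device model might look like the following. The function name summarizeOnDevice and the instruction text are hypothetical; the calls used (SystemLanguageModel.default.availability, LanguageModelSession, respond(to:)) are from the Foundation Models framework as introduced with macOS 26 and iOS 26, so check them against the SDK version you build with. The canImport guard is only there so the file still compiles on older SDKs, where the function returns nil.

```swift
import Foundation
#if canImport(FoundationModels)
import FoundationModels
#endif

/// Returns a one-sentence summary from the on-device model, or nil when the
/// framework or model is unavailable (older OS, unsupported device, Apple
/// Intelligence turned off) or when generation fails.
func summarizeOnDevice(_ text: String) async -> String? {
    #if canImport(FoundationModels)
    if #available(macOS 26.0, iOS 26.0, *) {
        // Check availability first; the model may not be ready on this device.
        guard case .available = SystemLanguageModel.default.availability else {
            return nil
        }
        let session = LanguageModelSession(
            instructions: "Summarize the user's text in one sentence.")
        // respond(to:) runs the prompt locally; .content is the generated text.
        return try? await session.respond(to: text).content
    }
    #endif
    return nil
}
```

Returning nil rather than throwing keeps the caller simple: a nil result is the signal to fall back to a cloud model or to hide the feature.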

Choosing

Need the most capable model / complex reasoning? Cloud API.

Privacy, offline, no per-call cost, system integration? On-device.

Hybrid: on-device for quick/private tasks, cloud for heavy lifting — a common pattern.
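The decision rules above can be sketched as a small routing function. Everything here (Backend, AIRequest, chooseBackend, and the two flags) is a hypothetical illustration of one possible policy, not an established API; real apps will likely weigh more signals, such as prompt length or user settings.

```swift
enum Backend { case onDevice, cloud }

struct AIRequest {
    let prompt: String
    let isSensitive: Bool          // data that must not leave the device
    let needsHeavyReasoning: Bool  // long context or multi-step reasoning
}

/// Picks a backend, or nil when no acceptable backend exists
/// (for example, sensitive data with no on-device model).
func chooseBackend(
    for request: AIRequest,
    isOnline: Bool,
    onDeviceAvailable: Bool
) -> Backend? {
    // Private data or no network: on-device is the only acceptable option.
    if request.isSensitive || !isOnline {
        return onDeviceAvailable ? .onDevice : nil
    }
    // Heavy work goes to the more capable cloud model.
    if request.needsHeavyReasoning { return .cloud }
    // Light, non-sensitive work: prefer local (no per-call cost), else cloud.
    return onDeviceAvailable ? .onDevice : .cloud
}
```

Note that in this sketch privacy wins over capability: a sensitive request that needs heavy reasoning still stays on-device, and the caller has to handle nil by telling the user the task cannot run right now.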

FAQ

Can I run an LLM fully on a Mac? Yes, via Apple's on-device models, Core ML, or MLX/llama.cpp on Apple Silicon.

How do I call OpenAI from Swift? URLSession POST to the chat endpoint; stream for responsive UIs.

Why on-device? Privacy, offline use, and no per-request cost, at the price of smaller models.

Best of both? Hybrid: on-device for light/private tasks, cloud for the hard ones.

Summary

For AI in Swift/macOS, call cloud LLMs with URLSession for maximum capability, or use on-device options (Foundation Models, Core ML, MLX, llama.cpp) for private, offline inference with no per-request cost. Many apps blend both: local for quick private tasks, cloud for heavy reasoning.

*Last updated: June 2026. Verify against Apple's developer documentation.*
