Swift AI for macOS
Native macOS AI applications with Swift and Foundation Models
Building AI features into native macOS (and iOS) apps with Swift gives you two paths: call cloud LLM APIs from Swift, or run models on-device with Apple's frameworks for privacy and offline use. This guide covers both and when to choose each.
Two approaches
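Cloud API: call a hosted LLM over HTTPS with URLSession. Most capable models and simplest to start, but it needs a network connection and sends data off-device.
On-device: run a model locally with Apple's frameworks. Private, offline, and no per-request cost, at the price of smaller models.
Calling a cloud LLM from Swift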
swift
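import Foundation
// Assumption: the key is read from the environment; a shipping app should not embed it in the binary.
let apiKey = ProcessInfo.processInfo.environment["OPENAI_API_KEY"] ?? ""
struct ChatReq: Encodable { let model: String; let messages: [[String: String]] }
// Only the response fields read below.
struct ChatResp: Decodable {
    struct Choice: Decodable { struct Message: Decodable { let content: String? }; let message: Message }
    let choices: [Choice]
}
func ask(_ prompt: String) async throws -> String {
    var req = URLRequest(url: URL(string: "https://api.openai.com/v1/chat/completions")!)
    req.httpMethod = "POST"
    req.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    req.setValue("application/json", forHTTPHeaderField: "Content-Type")
    req.httpBody = try JSONEncoder().encode(
        ChatReq(model: "gpt-4o", messages: [["role": "user", "content": prompt]]))
    let (data, _) = try await URLSession.shared.data(for: req)
    // Decode choices[0].message.content; an error body from the API fails decoding and throws.
    let reply = try JSONDecoder().decode(ChatResp.self, from: data)
    return reply.choices.first?.message.content ?? ""
}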
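For responsive UIs, stream the response instead of waiting for the full body: read it with URLSession's bytes(for:) API and handle the server-sent events (SSE) as they arrive. It is the same idea as Streaming AI with SSE.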
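A minimal sketch of the streaming approach, assuming the chat completions endpoint is called with "stream": true in the request body and replies with SSE lines of the form `data: {json}`, ending with `data: [DONE]`. The names ChatChunk, textDelta(fromSSELine:), and streamReply(for:onDelta:) are illustrative, not part of any library.

```swift
import Foundation

// Shape of one streamed chunk: choices[0].delta.content (only the fields read here).
struct ChatChunk: Decodable {
    struct Choice: Decodable {
        struct Delta: Decodable { let content: String? }
        let delta: Delta
    }
    let choices: [Choice]
}

/// Returns the text delta carried by one SSE line, or nil for
/// non-data lines, the "[DONE]" sentinel, and payloads that fail to decode.
func textDelta(fromSSELine line: String) -> String? {
    guard line.hasPrefix("data:") else { return nil }
    let payload = line.dropFirst(5).trimmingCharacters(in: .whitespaces)
    guard payload != "[DONE]",
          let chunk = try? JSONDecoder().decode(ChatChunk.self, from: Data(payload.utf8))
    else { return nil }
    return chunk.choices.first?.delta.content
}

/// Streams a prepared request and calls onDelta for each piece of text.
/// Assumption: the caller already set the headers and a body containing "stream": true.
func streamReply(for request: URLRequest, onDelta: (String) -> Void) async throws {
    let (bytes, response) = try await URLSession.shared.bytes(for: request)
    guard (response as? HTTPURLResponse)?.statusCode == 200 else {
        throw URLError(.badServerResponse)
    }
    // bytes.lines yields the body line by line as it arrives.
    for try await line in bytes.lines {
        if let delta = textDelta(fromSSELine: line) { onDelta(delta) }
    }
}
```

In a SwiftUI app, onDelta would typically append each delta to a state property on the main actor so the text grows as tokens arrive. Keeping the line parser separate from the network call makes it easy to unit-test without a connection.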
On-device with Apple frameworks
Apple's Foundation Models framework, part of Apple Intelligence, exposes an on-device LLM to Swift apps for summarization, generation, and tool calling. It is private and works offline, which makes it a good fit for system-integrated features. Core ML runs models you have converted yourself, while MLX and llama.cpp run open models on Apple Silicon using Metal. For picking a small local model, see local LLM comparison.
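As a rough sketch of the Foundation Models path, the snippet below checks that the system model is available and then asks it for a summary. It assumes an SDK that ships the FoundationModels module and a device with Apple Intelligence enabled; summaryPrompt(for:), summarizeOnDevice(_:), and OnDeviceError are hypothetical names, and the prompt wording is only an example. Verify the calls against Apple's current documentation before relying on them.

```swift
import Foundation
#if canImport(FoundationModels)
import FoundationModels
#endif

enum OnDeviceError: Error { case modelUnavailable }

/// Builds the prompt sent to the model (illustrative wording).
func summaryPrompt(for text: String) -> String {
    "Summarize the following text in two sentences:\n\n\(text)"
}

#if canImport(FoundationModels)
@available(macOS 26.0, iOS 26.0, *)
func summarizeOnDevice(_ text: String) async throws -> String {
    // The system model can be unavailable, for example on unsupported
    // hardware or when Apple Intelligence is turned off.
    guard case .available = SystemLanguageModel.default.availability else {
        throw OnDeviceError.modelUnavailable
    }
    // A session holds the instructions and conversation state for one task.
    let session = LanguageModelSession(instructions: "You are a concise assistant.")
    let response = try await session.respond(to: summaryPrompt(for: text))
    return response.content
}
#endif
```

Throwing a distinct error when the model is unavailable lets the caller fall back to another path, such as a cloud request, instead of failing silently.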
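Choosing
Choose cloud when you need the most capable models or heavy reasoning and can accept a network dependency and data leaving the device. Choose on-device when privacy, offline use, or per-request cost matter more than model size. Many apps go hybrid: on-device for light or private tasks, cloud for the hard ones.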
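One way to picture a hybrid setup is a small routing function that decides, per task, where inference should run. This is only a sketch under assumed rules (private data never leaves the device, heavy reasoning prefers the cloud); Route, AITask, and chooseRoute(for:isOnline:onDeviceAvailable:) are hypothetical names, and a real app would tune the policy to its own requirements.

```swift
import Foundation

enum Route { case onDevice, cloud }

struct AITask {
    var containsPrivateData: Bool
    var needsHeavyReasoning: Bool
}

/// Picks where to run a task, or returns nil when no acceptable option exists.
func chooseRoute(for task: AITask, isOnline: Bool, onDeviceAvailable: Bool) -> Route? {
    // Assumed policy: private data stays local, so refuse rather than upload it.
    if task.containsPrivateData { return onDeviceAvailable ? .onDevice : nil }
    // Heavy reasoning goes to the cloud whenever the network is reachable.
    if task.needsHeavyReasoning && isOnline { return .cloud }
    // Everything else prefers the local model and falls back to the cloud.
    if onDeviceAvailable { return .onDevice }
    return isOnline ? .cloud : nil
}
```

For example, a light, non-private task with no network still resolves to the on-device route, while a private task on a machine without a local model returns nil so the UI can explain why the feature is unavailable.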
FAQ
Can I run an LLM fully on a Mac? Yes, via Apple's on-device models, Core ML, or MLX/llama.cpp on Apple Silicon.
How do I call OpenAI from Swift? Send a URLSession POST request to the chat completions endpoint, and stream the response for responsive UIs.
Why on-device? Privacy, offline use, and no per-request cost, at the price of smaller models.
Best of both? Go hybrid: on-device for light or private tasks, cloud for the hard ones.
Summary
For AI in Swift on macOS, call cloud LLMs with URLSession for maximum capability, or use on-device options (Foundation Models, Core ML, MLX) for private, offline inference with no per-request cost. Many apps blend both: local for quick private tasks, cloud for heavy reasoning.
*Last updated: June 2026. Verify against Apple's developer documentation.*