← Back to tutorials

AI Learning Roadmap 2025

Structured learning path for becoming an AI engineer in 2025

AI Learning Roadmap: The Structured Curriculum

This is the curriculum view of learning AI engineering — what to study, in what order, with checkpoints to verify you actually learned it. (The career strategy — roles, market, job-hunt sequencing — lives in How to Become an AI Engineer; this page is the syllabus that plugs into it.)

Design principles of this roadmap

  • Build-first: every phase ends with a runnable artifact, not a completed video course
  • Raw-before-framework: SDKs before LangChain — you can't debug abstractions you don't understand
  • Evaluation-early: measuring quality is taught in phase 2, not as an afterthought — it's the skill that separates hireable from hobbyist
  • Phase 0 — Prerequisites (skip what you have)

    Python (functions, typing, venvs, async basics — LLM apps are I/O-bound: why); Git/GitHub; HTTP/REST + JSON; SQL basics. *Checkpoint: you can build and call a small FastAPI service.*

    Explicitly not required to start: linear algebra, calculus, classical ML theory. They matter for the research/ML-engineer track; for application engineering they're electives (Phase 5).

    Phase 1 — LLM API fundamentals (2-3 weeks)

    Raw provider SDKs: chat completions, streaming, system prompts, structured outputs + validation, tool calling, error handling/retries (production basics). Prompt craft as engineering: the cheat-sheet patterns + why prompts behave like code. Understand the model landscape and tier economics. *Checkpoint: a CLI/notebook tool that calls an LLM with streaming, validated JSON output, and a tool call — no frameworks.*

    Phase 2 — RAG + evaluation (3-4 weeks)

    Embeddings and similarity; chunking; the full retrieval pipeline; pgvector as the first vector store; hybrid search and reranking as measured experiments. And the eval layer, same phase: build a 50-100 question eval set, score groundedness, make one improvement and *show the delta* (workflow). *Checkpoint: deployed RAG app over a corpus you know, with an eval table in the README.*

    Phase 3 — Production engineering (2-3 weeks)

    Streaming endpoints; async + queues for bulk work (batch APIs); cost instrumentation; fallbacks; observability (tooling); security basics (injection, PII, compliance awareness). *Checkpoint: your Phase-2 app survives a provider outage (fallback), shows cost-per-query, and has traces.*

    Phase 4 — Agents and orchestration (3-4 weeks)

    Tool design; agent loops from scratch (one time, ~100 lines — the demystifier); then LangGraph for state/persistence/human-in-the-loop; when agents are the wrong tool; multi-agent patterns as reading. *Checkpoint: an agent that does consequential work behind an approval gate, with bounded budget and persistent state.*

    Phase 5 — Electives by destination

  • Infra-curious: inference serving, KV cache internals, local models
  • Model-curious: fine-tuning with LoRA, RLHF/DPO concepts, and *now* the math foundations
  • Multimodal: vision pipelines, OCR, audio
  • Product: SaaS architecture patterns, system-design interview prep
  • How to study (the part that determines whether this works)

  • Cap course time at 30% — the other 70% is building the checkpoints
  • One project that evolves through phases 2-4 beats four disconnected demos — and becomes your portfolio centerpiece
  • Write as you go: a short post per phase on what surprised you — the habit that compounds into being findable
  • Total realistic timeline: 3-4 months part-time to Phase-4 checkpoint with prior programming experience; double without it (spend the difference in Phase 0, not later)

  • *Last updated: June 2026.*

    Also available in 中文.