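Tutorial Center
AI Agents from Beginner to Practice: Core Concepts, Using MCP, Hands-On Platform Work, and Workflow Automation
2024
Total tutorials
368
Beginner tutorials
45
Hands-on tutorials
Browse by topic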
Curriculum Data Collection: Advanced Guide
Collecting training data in strategic progressions
Curriculum Data Collection: Advanced Guide Overview Collecting training data in strategic progressions. This comprehensive guide covers everything you need to know for production implementation. Why It Matters Curriculum Data Collection: Advanced
Auto-scaling AI Inference: Production Setup Guide
Dynamic scaling of AI inference based on demand
Auto-scaling AI Inference Overview Dynamic scaling of AI inference based on demand. This guide provides practical, production-ready implementations. **Category**: ai-infrastructure **Primary Tool**: kubernetes **Tags**: infrastructure, devops,
Documentation Agent: Complete Tutorial
Agent that autonomously maintains and updates documentation
Documentation Agent Overview Agent that autonomously maintains and updates documentation. This guide covers architecture, implementation, and production deployment of AI agents. Agent Architecture ``` User Input ↓ Agent Orchestrator ↓ ┌──
Flash Attention Optimization: Technical Deep Dive
How FlashAttention speeds up transformer inference
Flash Attention Optimization: Technical Deep Dive Overview How FlashAttention speeds up transformer inference. This comprehensive guide covers everything you need to know for production implementation. Why It Matters Flash Attention Optimization:
Contextual Compression RAG: Implementation Guide with Pinecone 2026
Build a RAG system from scratch that compresses retrieved context to fit the LLM context window
Contextual Compression RAG: Complete Implementation 2026 Overview Contextual Compression RAG is a specialized retrieval pattern that focuses on compressing retrieved context to fit LLM window. This guide shows you how to build a production-ready sy
Long-Form Content Generation: Advanced Guide
Producing coherent, high-quality long-form AI content
Long-Form Content Generation: Advanced Guide Overview Producing coherent, high-quality long-form AI content. This comprehensive guide covers everything you need to know for production implementation. Why It Matters Long-Form Content Generation: A
Distributed AI Tracing
End-to-end tracing across AI service boundaries
Distributed AI Tracing Overview End-to-end tracing across AI service boundaries Implementation ```python from openai import OpenAI from pydantic import BaseModel from typing import Optional import json client = OpenAI() class Handler: """Ha
AI Service Health Checks
Implementing comprehensive health checks for AI APIs
AI Service Health Checks Overview Implementing comprehensive health checks for AI APIs Implementation ```python from openai import OpenAI from pydantic import BaseModel from typing import Optional import json client = OpenAI() class Handler:
AI Data Lake Architecture: Production Setup Guide
Building scalable data lakes for AI training data
AI Data Lake Architecture Overview Building scalable data lakes for AI training data. This guide provides practical, production-ready implementations. **Category**: ai-infrastructure **Primary Tool**: s3 **Tags**: infrastructure, devops, s3, p
AI Request Queue System: Production AI Architecture Guide 2026
How to handle bursts of AI traffic with queues
AI Request Queue System: Production Architecture 2026 Overview **AI Request Queue System** solves the challenge of handling burst AI traffic with queues. This guide covers the design decisions, implementation details, and trade-offs you need to kno
Self-Query RAG: Implementation Guide with Qdrant 2026
Build a RAG system from scratch that uses AI-generated metadata filters for precise retrieval
Self-Query RAG: Complete Implementation 2026 Overview Self-Query RAG is a specialized retrieval pattern that focuses on AI-generated metadata filters for precise retrieval. This guide shows you how to build a production-ready system using Qdrant.
How to Build Multi-Modal AI App: Complete Guide for Developers 2026
Build an app that understands text and images, step by step
How to Build Multi-Modal AI App 2026 Introduction In this tutorial, you'll learn how to **Build Multi-Modal AI App**. By the end, you'll have a working **app that understands text and images** that you can deploy and extend. **Prerequisites:** - E
AI Knowledge Distillation Pipeline: Advanced Guide
Distilling large model knowledge into smaller models
AI Knowledge Distillation Pipeline: Advanced Guide Overview Distilling large model knowledge into smaller models. This comprehensive guide covers everything you need to know for production implementation. Why It Matters AI Knowledge Distillation
Building Customer Support Agent with AI Agents: Complete Guide 2026
Create autonomous LLM agents that handle customer inquiries using your knowledge base
Building Customer Support Agent with AI Agents 2026 Introduction AI agents that can handle customer inquiries using your knowledge base are transforming how developers work. This guide shows you how to build a production-ready Customer Support Agen
AI Response Caching Layer: Production AI Architecture Guide 2026
How to implement semantic caching for LLM responses
AI Response Caching Layer: Production Architecture 2026 Overview **AI Response Caching Layer** solves the challenge of semantic caching for LLM responses. This guide covers the design decisions, implementation details, and trade-offs you need to kn
How to Build an AI Agent with Tool Use: Complete Guide for Developers 2026
Build an autonomous AI assistant, step by step
How to Build an AI Agent with Tool Use 2026 Introduction In this tutorial, you'll learn how to **Build an AI Agent with Tool Use**. By the end, you'll have a working **autonomous AI assistant** that you can deploy and extend. **Prerequisites:** -
AI Request Queue Management
Managing request queues for AI inference workloads
AI Request Queue Management Overview Managing request queues for AI inference workloads Implementation ```python from openai import OpenAI from pydantic import BaseModel from typing import Optional import json client = OpenAI() class Handler:
ML Testing Strategies
Unit, integration, and regression testing for ML systems
ML Testing Strategies Overview Unit, integration, and regression testing for ML systems. This guide covers practical implementation for production ML systems. Why This Matters in MLOps Modern ML systems require rigorous operations practices: - **
Kubeflow ML Pipelines
Orchestrating ML workflows on Kubernetes with Kubeflow
Kubeflow ML Pipelines Overview Orchestrating ML workflows on Kubernetes with Kubeflow. This guide covers practical implementation for production ML systems. Why This Matters in MLOps Modern ML systems require rigorous operations practices: - **Re
Ensemble AI Systems: Advanced Guide
Combining multiple models for higher accuracy
Ensemble AI Systems: Advanced Guide Overview Combining multiple models for higher accuracy. This comprehensive guide covers everything you need to know for production implementation. Why It Matters Ensemble AI Systems: Advanced Guide is increasin
Model Registry Best Practices
Managing ML model lifecycle from development to production
Model Registry Best Practices Overview Managing ML model lifecycle from development to production. This guide covers practical implementation for production ML systems. Why This Matters in MLOps Modern ML systems require rigorous operations pract
Deployment of Fine-tuned Models: Hands-On Tutorial
Serving custom fine-tuned models with vLLM and TGI — step-by-step implementation guide
Deployment of Fine-tuned Models Overview Serving custom fine-tuned models with vLLM and TGI. This tutorial provides a complete, runnable implementation. Prerequisites ```bash Install required packages pip install transformers datasets peft trl ac
ML Model Versioning with DVC
Data Version Control for ML experiments and model tracking
ML Model Versioning with DVC Overview Data Version Control for ML experiments and model tracking. This guide covers practical implementation for production ML systems. Why This Matters in MLOps Modern ML systems require rigorous operations practi
AI Agent Architecture: Complete Developer Guide 2026
Master AI Agent Architecture with practical examples and production patterns
AI Agent Architecture: Complete Developer Guide 2026 Overview AI Agent Architecture is one of the most important concepts in modern AI development. This guide provides a thorough understanding with practical, production-ready examples. Why AI Agen
Graph RAG: Implementation Guide with Neo4j 2026
Build a RAG system from scratch that traverses a knowledge graph for multi-hop reasoning
Graph RAG: Complete Implementation 2026 Overview Graph RAG is a specialized retrieval pattern that focuses on knowledge graph traversal for multi-hop reasoning. This guide shows you how to build a production-ready system using Neo4j. Why Graph RAG
Fine-tuning Mistral Models: Hands-On Tutorial
Mistral 7B fine-tuning for domain specialization — step-by-step implementation guide
Fine-tuning Mistral Models Overview Mistral 7B fine-tuning for domain specialization. This tutorial provides a complete, runnable implementation. Prerequisites ```bash Install required packages pip install transformers datasets peft trl accelerat
Continuous Training Pipelines
Automated model retraining triggered by data or performance changes
Continuous Training Pipelines Overview Automated model retraining triggered by data or performance changes. This guide covers practical implementation for production ML systems. Why This Matters in MLOps Modern ML systems require rigorous operati
PostgreSQL with pgvector: Production Setup Guide
Vector similarity search using PostgreSQL pgvector extension
PostgreSQL with pgvector Overview Vector similarity search using PostgreSQL pgvector extension. This guide provides practical, production-ready implementations. **Category**: ai-infrastructure **Primary Tool**: postgresql **Tags**: infrastruct
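Event-Driven AI Architecture: Building Real-Time AI Response Systems with Kafka + LLM
Build intelligent AI applications with millisecond-level response using event stream processing and LLMs
Explains how to build event-driven AI systems with Apache Kafka and LLMs, including real-time stream analytics, anomaly-detection triggers, automated execution of AI decisions, and multi-system AI orchestration, to achieve high-throughput, real-time intelligent decision-making.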
LLM Cost Optimization
Reducing LLM API costs in production through caching and batching
LLM Cost Optimization Overview Reducing LLM API costs in production through caching and batching. This guide covers practical implementation for production ML systems. Why This Matters in MLOps Modern ML systems require rigorous operations practi
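Knowledge Graphs + LLMs: Engineering GraphRAG and Graph-of-Thoughts
Use knowledge graphs to strengthen LLM reasoning and address multi-hop questions and factual accuracy
Covers technical approaches to combining knowledge graphs with LLMs, including graph retrieval in Microsoft GraphRAG, automated knowledge graph construction, Graph-of-Thoughts reasoning, and knowledge graph applications in specialized domains such as medicine and law.
Personalized Recommender System Design: From Collaborative Filtering to LLM-Enhanced Modern Recommendation Architectures
Combine traditional recommendation algorithms with large language models to build next-generation personalized experiences
Covers the architectural evolution of modern recommender systems, from traditional matrix factorization and deep learning recommendation models to using LLMs to understand user intent and generate recommendation explanations, along with engineering solutions for the cold-start problem and real-time features.
Model Compression and Knowledge Distillation: Transferring 70B Model Capabilities to a 7B Model
Use knowledge distillation to retain large-model capabilities while cutting inference cost by 10x
An in-depth look at the core techniques of model compression, including knowledge distillation, pruning, quantization, and low-rank factorization, and at how to transfer the capabilities of a large teacher model to a small student model, lowering cost while retaining core capabilities.
Multi-Tenant Architecture for AI SaaS: Engineering Isolation, Quotas, and Personalization
Build a scalable multi-tenant AI SaaS platform that securely isolates customer data and AI usage
Covers the architecture of multi-tenant AI SaaS applications, including model and data isolation strategies, per-tenant quota management, personalized AI customization, cost allocation, and the technical implementation of enterprise compliance requirements.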
Memory-Augmented AI Agent: Complete Tutorial
Implementing short and long-term memory for AI agents
Memory-Augmented AI Agent Overview Implementing short and long-term memory for AI agents. This guide covers architecture, implementation, and production deployment of AI agents. Agent Architecture ``` User Input ↓ Agent Orchestrator ↓ ┌──
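AI Streaming Inference Best Practices: Engineering SSE, WebSocket, and Streaming Responses
Build AI applications with low perceived latency, so users experience an immediate response instead of waiting
A detailed look at engineering approaches to streaming AI output, including server-side SSE/WebSocket implementation, front-end streaming rendering, middle-tier buffering design, and error handling, and at how to build production-ready streaming AI applications in Next.js and FastAPI.
AI Data Annotation Quality Management: A Complete Quality Control System from Crowdsourcing to Expert Annotation
A systematic approach to improving annotation quality, which directly affects final AI model performance
Covers quality management methods for AI training data annotation, including annotation guideline design, measuring inter-annotator agreement, quality control processes, reducing annotation volume with active learning, and how to choose among crowdsourced, in-house, and expert annotation.
LLM Application Testing Strategy: A Complete Approach to Unit, Integration, and End-to-End AI Testing
Apply traditional software testing practices to LLM applications and build a reliable AI quality assurance system
Provides a complete testing strategy for LLM applications, including prompt unit tests, chain integration tests, end-to-end scenario tests, and a regression testing framework, along with how to handle the non-deterministic nature of LLM output.
AI Application Observability: LLM Monitoring in Practice with Langfuse, Arize AI, and Helicone
Build a complete monitoring system for LLM applications and track quality, cost, and user experience in real time
Covers observability best practices for LLM applications, including trace design, quality metric definition, cost monitoring and alerting, and user feedback integration, along with practical use of tools such as Langfuse and Arize AI.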
Multi-Objective Optimization for LLMs: Advanced Guide
Balancing competing objectives in LLM applications
Multi-Objective Optimization for LLMs: Advanced Guide Overview Balancing competing objectives in LLM applications. This comprehensive guide covers everything you need to know for production implementation. Why It Matters Multi-Objective Optimizat
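LangChain LCEL Production Guide: Building Reliable, Observable LLM Application Chains
From prototype to production: master advanced usage of the LangChain Expression Language
An in-depth look at production-grade use of LangChain LCEL (LangChain Expression Language), including streaming output, parallel chains, error handling, tracing integration, and performance optimization, to help teams build robust LLM applications.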
Feedback Loop Architecture: Production AI Architecture Guide 2026
How to collect and use feedback to improve AI quality
Feedback Loop Architecture: Production Architecture 2026 Overview **Feedback Loop Architecture** solves the challenge of collecting and using feedback to improve AI quality. This guide covers the design decisions, implementation details, and trade-
Sidecar Pattern for AI Logging
AI service logging with sidecar container pattern
Sidecar Pattern for AI Logging Overview AI service logging with sidecar container pattern Implementation ```python from openai import OpenAI from pydantic import BaseModel from typing import Optional import json client = OpenAI() class Handler:
AI Observability Stack: Production Setup Guide
Full observability for AI systems with OpenTelemetry
AI Observability Stack Overview Full observability for AI systems with OpenTelemetry. This guide provides practical, production-ready implementations. **Category**: ai-infrastructure **Primary Tool**: opentelemetry **Tags**: infrastructure, de
Federated Learning Patterns: Advanced Guide
Privacy-preserving distributed ML training
Federated Learning Patterns: Advanced Guide Overview Privacy-preserving distributed ML training. This comprehensive guide covers everything you need to know for production implementation. Why It Matters Federated Learning Patterns: Advanced Guide
QLoRA: Quantized LoRA: Hands-On Tutorial
Combining quantization with LoRA for 4-bit fine-tuning — step-by-step implementation guide
QLoRA: Quantized LoRA Overview Combining quantization with LoRA for 4-bit fine-tuning. This tutorial provides a complete, runnable implementation. Prerequisites ```bash Install required packages pip install transformers datasets peft trl accelera
Parent Document RAG: Implementation Guide with Chroma 2026
Build a RAG system from scratch that retrieves small chunks along with their larger parent context
Parent Document RAG: Complete Implementation 2026 Overview Parent Document RAG is a specialized retrieval pattern that focuses on retrieving small chunks with large parent context. This guide shows you how to build a production-ready system using C
Deploy Any ONNX Model on ONNX Runtime CrossPlatform — Cross-platform deployment
Complete setup guide for running Any ONNX Model locally on ONNX Runtime CrossPlatform for cross-platform deployment
Deploy Any ONNX Model on ONNX Runtime CrossPlatform Overview Run Any ONNX Model directly on ONNX Runtime CrossPlatform for cross-platform deployment. Local inference offers privacy, no network round-trip latency, and no ongoing API costs. **Specs**: ONNX Runtime ·