LlamaIndex Practical Guide: RAG Application Development from Beginner to Production
LlamaIndex vs LangChain: How to Choose? 5 Real-World Code Examples
LlamaIndex vs LangChain: How to Choose?
In a nutshell: LlamaIndex focuses on data indexing and retrieval, while LangChain focuses on agent orchestration and chaining.
Selection Principle: Use LlamaIndex for RAG knowledge bases; use LangChain for agent workflows; they can be combined.
Installation
bash
pip install llama-index llama-index-llms-openai llama-index-embeddings-openai
Scenario 1: Build a Document Q&A System in 5 Minutes
python
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

# Configure the default LLM and embedding model once, globally.
# OpenAIEmbedding reads OPENAI_API_KEY from the environment unless api_key is passed.
Settings.llm = OpenAI(model="gpt-4o", api_key="sk-...")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# Load documents (supports PDF, Word, TXT, HTML, etc.)
documents = SimpleDirectoryReader("./docs").load_data()

# Build the vector index and ask a question
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("What is the core conclusion of this document?")
print(response)
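To make the retrieval step less of a black box, here is a minimal, dependency-free sketch of what a vector index does conceptually: embed each chunk, embed the query, rank chunks by cosine similarity, and keep the top k. The `toy_embed` function is a hypothetical stand-in based on word counts, not the OpenAI embedding model, and none of this is LlamaIndex's actual implementation.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for a real embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[tuple[float, str]]:
    """Return the top_k chunks most similar to the query, best first."""
    q = toy_embed(query)
    scored = [(cosine(q, toy_embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


chunks = [
    "revenue grew in the third quarter",
    "the roadmap lists new features",
    "quarter revenue and profit summary",
]
results = retrieve("third quarter revenue", chunks, top_k=2)
for score, chunk in results:
    print(f"{score:.2f}  {chunk}")
```

A real index swaps the word counts for dense vectors from an embedding model, but the ranking idea is the same: only the few best-scoring chunks are passed to the LLM as context.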
Scenario 2: Persistent Storage (Essential for Production)
python
import os
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex, load_index_from_storage

PERSIST_DIR = "./storage"

if not os.path.exists(PERSIST_DIR):
    # First run: build the index and save it to disk
    documents = SimpleDirectoryReader("./docs").load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    # Later runs: reload the saved index instead of re-embedding everything
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)
Scenario 3: Multi-Source Documents with Metadata
python
from llama_index.core import Document, VectorStoreIndex
from llama_index.core.vector_stores import FilterOperator, MetadataFilter, MetadataFilters

docs = [
    Document(
        text="Q3 financial report shows revenue growth of 23%...",
        metadata={"source": "financial_report", "year": 2025, "quarter": "Q3"},
    ),
    Document(
        text="Product roadmap: new features to be released in Q1 2026...",
        metadata={"source": "internal_doc", "type": "roadmap"},
    ),
]
index = VectorStoreIndex.from_documents(docs)

# Query filtered by source
query_engine = index.as_query_engine(
    filters=MetadataFilters(filters=[
        MetadataFilter(key="source", value="financial_report", operator=FilterOperator.EQ)
    ])
)
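The effect of an equality filter can be illustrated without any library: documents whose metadata does not match are dropped before similarity ranking ever sees them. The sketch below is a plain-Python illustration under that assumption. `Doc` and `matches` are hypothetical names, and combining several filters with AND is assumed here for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical minimal document: text plus a metadata dict."""
    text: str
    metadata: dict = field(default_factory=dict)


def matches(doc: Doc, filters: dict) -> bool:
    """True if every filter key is present in the metadata with an equal value (AND)."""
    return all(
        key in doc.metadata and doc.metadata[key] == value
        for key, value in filters.items()
    )


docs = [
    Doc("Q3 financial report shows revenue growth of 23%...",
        {"source": "financial_report", "year": 2025, "quarter": "Q3"}),
    Doc("Product roadmap: new features to be released in Q1 2026...",
        {"source": "internal_doc", "type": "roadmap"}),
]

# Only documents that pass the filter remain candidates for retrieval
candidates = [d for d in docs if matches(d, {"source": "financial_report"})]
print([d.metadata["source"] for d in candidates])
```

Because filtering happens on exact metadata values, it pays to keep keys and values consistent at ingestion time (for example, always `"financial_report"`, never a mix of spellings).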
Scenario 4: Connect to Qdrant Vector Database
python
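import qdrant_client
from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.vector_stores.qdrant import QdrantVectorStore

# Requires: pip install llama-index-vector-stores-qdrant qdrant-client
client = qdrant_client.QdrantClient(url="http://localhost:6333")
vector_store = QdrantVectorStore(client=client, collection_name="my_docs")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
# `documents` is the list loaded with SimpleDirectoryReader in Scenario 1
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)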
Scenario 5: Streaming Output
python
query_engine = index.as_query_engine(streaming=True)
streaming_response = query_engine.query("Please explain this issue in detail")
for text in streaming_response.response_gen:
    print(text, end="", flush=True)
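A token generator can only be consumed once, so if the full answer is also needed afterwards (for logging or caching, say), it has to be collected while it is printed. The following sketch shows that pattern with a hypothetical `fake_token_stream` standing in for `streaming_response.response_gen`, so it runs without an API key.

```python
from typing import Iterator


def fake_token_stream() -> Iterator[str]:
    """Hypothetical stand-in for streaming_response.response_gen."""
    yield from ["Retrieval ", "augmented ", "generation."]


def stream_and_collect(tokens: Iterator[str]) -> str:
    """Print each chunk as it arrives and return the full text at the end."""
    parts = []
    for text in tokens:
        print(text, end="", flush=True)
        parts.append(text)
    print()  # finish the line once the stream is exhausted
    return "".join(parts)


full_text = stream_and_collect(fake_token_stream())
```

In a real application, the same function could be handed `streaming_response.response_gen` in place of the fake stream.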
Production Best Practices
Incremental Index Updates (avoid full rebuild each time):
python
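# doc_id must be stable across runs for this check to work,
# e.g. SimpleDirectoryReader("./docs", filename_as_id=True)
existing_docs = index.ref_doc_info  # maps document IDs to their stored info
for doc in new_documents:
    if doc.doc_id not in existing_docs:
        index.insert(doc)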
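Checking IDs alone catches new documents but not documents whose content changed under the same ID. One common way to cover that case is to record a content hash at indexing time and compare it on the next run. The sketch below shows only that bookkeeping in plain Python. `content_hash` and `plan_updates` are hypothetical helpers, not LlamaIndex APIs.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_updates(indexed: dict[str, str], incoming: dict[str, str]) -> dict[str, list[str]]:
    """Classify incoming documents against what is already indexed.

    indexed:  doc_id -> content hash recorded at indexing time
    incoming: doc_id -> current document text
    """
    plan = {"insert": [], "update": [], "skip": []}
    for doc_id, text in incoming.items():
        if doc_id not in indexed:
            plan["insert"].append(doc_id)   # never seen before
        elif indexed[doc_id] != content_hash(text):
            plan["update"].append(doc_id)   # same ID, content changed
        else:
            plan["skip"].append(doc_id)     # unchanged, no re-embedding needed
    return plan


indexed = {"a.txt": content_hash("alpha"), "b.txt": content_hash("beta")}
incoming = {"a.txt": "alpha", "b.txt": "beta v2", "c.txt": "gamma"}
plan = plan_updates(indexed, incoming)
print(plan)
```

LlamaIndex also offers `index.refresh_ref_docs(documents)` for this kind of insert-or-update workflow. Its exact behavior may vary by version and vector store, so check the documentation for the release you use.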
Tune Retrieval Parameters:
python
query_engine = index.as_query_engine(
    similarity_top_k=5,  # number of chunks retrieved per query
    response_mode="tree_summarize",  # suitable for long document summarization
)
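The idea behind tree-style summarization can be sketched in a few lines: summarize small groups of chunks, then summarize the summaries, and repeat until one text remains. This is a simplified illustration under stated assumptions, not the library's implementation. A real implementation would group chunks by token budget and include the query in every prompt. `toy_summarize` is a hypothetical stand-in for an LLM call, and `fan_in` is an illustrative parameter.

```python
from typing import Callable


def tree_summarize(
    chunks: list[str],
    summarize: Callable[[list[str]], str],
    fan_in: int = 2,
) -> str:
    """Repeatedly summarize groups of fan_in texts until a single text remains."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2, otherwise the tree never shrinks")
    if not chunks:
        return ""
    level = list(chunks)
    while len(level) > 1:
        level = [summarize(level[i:i + fan_in]) for i in range(0, len(level), fan_in)]
    return level[0]


def toy_summarize(texts: list[str]) -> str:
    """Hypothetical stand-in for an LLM call: joins its inputs inside brackets."""
    return "[" + " + ".join(texts) + "]"


summary = tree_summarize(["c1", "c2", "c3", "c4", "c5"], toy_summarize)
print(summary)
```

The nesting of the brackets in the printed result shows the shape of the tree. A larger `similarity_top_k` means more leaves and therefore more LLM calls, which is the cost side of retrieving more context.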
FAQ
Q: What document formats are supported? A: PDF, Word, PPT, Excel, HTML, Markdown, TXT, CSV, JSON, as well as databases, Notion, Google Drive, and 100+ other sources.
Q: Does it work well with Chinese? A: Yes, Chinese is fully supported. We recommend a BGE Chinese embedding model, which generally retrieves Chinese text better than OpenAI embeddings and costs less because it can run locally.
Q: What is the relationship with Dify? A: Dify provides a visual interface and can integrate LlamaIndex's retrieval capabilities under the hood. Use LlamaIndex for custom development, and Dify for rapid prototyping.