Building AI Recommendation Systems for E-Commerce: Beyond Collaborative Filtering
Modern approaches to personalization that drive conversion and retention
Learn how to build and deploy production recommendation systems using modern AI techniques—from two-tower neural networks and session-based recommendations to LLM-powered conversational shopping.
The Business Impact of Recommendations
An often-cited estimate attributes 35% of Amazon's purchases to its recommendation system. Netflix has estimated that personalization and recommendations save it more than $1 billion per year by reducing subscriber churn. For e-commerce companies, recommendations are commonly reported to drive 26-30% of total revenue.
Yet many companies still rely on simple collaborative filtering approaches from around 2010. Modern AI recommendation systems can be considerably more powerful.
Architecture: Two-Stage Retrieval and Ranking
Production recommendation systems use a two-stage architecture:
Stage 1: Candidate Retrieval
Goal: Reduce millions of items to hundreds of candidates
Speed: Must be < 10ms
Method: Approximate nearest neighbor (ANN) search on item embeddings
Stage 2: Ranking
Goal: Rank hundreds of candidates by predicted conversion probability
Speed: Can be < 100ms (larger model)
Method: Learned ranking model with rich features (a deep network or gradient-boosted trees)
This architecture enables both scale (catalogs of millions of items or more) and accuracy (a complex ranking model applied only to the short candidate list).
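To make the two stages concrete, here is a minimal pure-Python sketch. It assumes a tiny in-memory catalog and uses brute-force dot products in place of a real ANN index. The names `retrieve`, `rank`, and the `predicted_conversion` scores are illustrative placeholders, not part of any library.

```python
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def retrieve(user_vec: Vector, item_vecs: Dict[str, Vector], k: int) -> List[str]:
    """Stage 1: cheap similarity search (brute force here, ANN in production)."""
    by_similarity = sorted(
        item_vecs, key=lambda item: dot(user_vec, item_vecs[item]), reverse=True
    )
    return by_similarity[:k]


def rank(candidates: List[str], score_fn: Callable[[str], float], n: int) -> List[str]:
    """Stage 2: apply a more expensive scorer to the short candidate list only."""
    return sorted(candidates, key=score_fn, reverse=True)[:n]


# Toy 2-dimensional embeddings (hypothetical values)
item_vecs = {
    "shoes": [0.9, 0.1],
    "socks": [0.8, 0.2],
    "laptop": [0.1, 0.9],
    "mouse": [0.2, 0.8],
}
user_vec = [1.0, 0.0]

# Hypothetical scores standing in for a ranking model's predicted conversion probability
predicted_conversion = {"shoes": 0.02, "socks": 0.05, "laptop": 0.10, "mouse": 0.01}

candidates = retrieve(user_vec, item_vecs, k=2)
top = rank(candidates, lambda item: predicted_conversion[item], n=2)
print(candidates, top)
```

Note that "laptop" has the highest predicted conversion but never reaches the ranker, because retrieval already filtered it out. The quality of Stage 1 therefore caps what Stage 2 can achieve.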
Stage 1: Two-Tower Neural Network for Retrieval
python
import tensorflow as tf
import tensorflow_recommenders as tfrs

# Assumed to be prepared earlier: unique_user_ids and unique_item_ids (vocabularies),
# items_dataset (a tf.data.Dataset of item IDs), cached_train (batched training data)

class TwoTowerModel(tfrs.Model):
    """
    Separate towers for user and item representations
    Enables efficient ANN search at inference time
    """
    def __init__(self, user_model, item_model, items_dataset):
        super().__init__()
        self.user_tower = user_model
        self.item_tower = item_model
        # Retrieval task
        self.task = tfrs.tasks.Retrieval(
            metrics=tfrs.metrics.FactorizedTopK(
                candidates=items_dataset.batch(128).map(item_model)
            )
        )

    def compute_loss(self, features, training=False):
        user_embeddings = self.user_tower(features["user_id"])
        item_embeddings = self.item_tower(features["item_id"])
        return self.task(user_embeddings, item_embeddings)

# User model - embeds the user ID (extend with history and other features as needed)
user_model = tf.keras.Sequential([
    tf.keras.layers.StringLookup(vocabulary=unique_user_ids, mask_token=None),
    tf.keras.layers.Embedding(len(unique_user_ids) + 1, 64),
    tf.keras.layers.Dense(32, activation='relu')
])

# Item model - embeds the item ID (extend with item features as needed)
item_model = tf.keras.Sequential([
    tf.keras.layers.StringLookup(vocabulary=unique_item_ids, mask_token=None),
    tf.keras.layers.Embedding(len(unique_item_ids) + 1, 64),
    tf.keras.layers.Dense(32, activation='relu')
])

model = TwoTowerModel(user_model, item_model, items_dataset)
model.compile(optimizer=tf.keras.optimizers.Adagrad(0.1))
model.fit(cached_train, epochs=3)
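Two-tower retrieval models are typically trained with an in-batch softmax loss: for each user in a batch, the item they interacted with is the positive and the other items in the batch act as negatives. The sketch below illustrates that idea in plain Python. It is not the TFRS implementation, and `in_batch_softmax_loss` is a name made up for this example.

```python
import math
from typing import List


def in_batch_softmax_loss(
    user_embs: List[List[float]], item_embs: List[List[float]]
) -> float:
    """Mean cross-entropy where row i's positive is item i and all
    other items in the batch serve as negatives."""
    losses = []
    for i, user in enumerate(user_embs):
        # Dot-product score between this user and every item in the batch
        logits = [sum(a * b for a, b in zip(user, item)) for item in item_embs]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum_exp = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


users = [[1.0, 0.0], [0.0, 1.0]]
aligned = in_batch_softmax_loss(users, [[1.0, 0.0], [0.0, 1.0]])
swapped = in_batch_softmax_loss(users, [[0.0, 1.0], [1.0, 0.0]])
print(aligned, swapped)
```

The loss is lower when each user embedding points toward its own item than when the pairs are swapped, which is the signal that pulls matching users and items together in the shared embedding space.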
Stage 2: Learning-to-Rank with LambdaRank
python
import lightgbm as lgb
import pandas as pd

def train_ranking_model(interactions_df: pd.DataFrame) -> lgb.Booster:
    """
    LambdaRank model for Learning-to-Rank
    Optimizes for NDCG (ranking quality metric)
    """
    features = [
        # User features
        'user_age_days', 'user_total_orders', 'user_avg_order_value',
        'user_preferred_category', 'user_price_sensitivity',
        # Item features
        'item_price', 'item_category', 'item_rating', 'item_review_count',
        'item_inventory_level', 'item_days_since_launch',
        # Interaction features (cross features)
        'user_item_category_affinity', 'user_item_price_match',
        'user_viewed_similar', 'user_cart_abandonment_similar',
        # Context features
        'hour_of_day', 'day_of_week', 'platform', 'search_query_match'
    ]
    # Columns assumed to hold string values; LightGBM needs them as pandas categoricals
    categorical = ['user_preferred_category', 'item_category', 'platform']

    # Rows of the same query must be contiguous and in the same order as the group sizes
    df = interactions_df.sort_values('query_id', kind='stable')
    df[categorical] = df[categorical].astype('category')
    group_sizes = df.groupby('query_id', sort=True).size().to_numpy()

    # LambdaRank - optimizes ranking directly
    train_data = lgb.Dataset(
        df[features],
        label=df['clicked'],
        group=group_sizes
    )
    params = {
        'objective': 'lambdarank',
        'metric': 'ndcg',
        'eval_at': [5, 10],
        'num_leaves': 63,
        'learning_rate': 0.05
    }
    model = lgb.train(params, train_data, num_boost_round=500)
    return model
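Since the ranker is evaluated on NDCG at 5 and 10, it helps to see what the metric computes. The following is a small sketch of NDCG@k using the common exponential gain (2^label - 1) and a log2 position discount. The function names are chosen for this example.

```python
import math
from typing import List


def dcg_at_k(relevances: List[float], k: int) -> float:
    """Discounted cumulative gain of the first k results, in ranked order."""
    return sum(
        (2 ** rel - 1) / math.log2(position + 2)
        for position, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: List[float], k: int) -> float:
    """DCG divided by the DCG of the ideal ordering; 0.0 if nothing is relevant."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Labels listed in the order the model ranked the items (1 = clicked)
click_at_top = ndcg_at_k([1, 0, 0], k=3)
click_at_second = ndcg_at_k([0, 1, 0], k=3)
print(click_at_top, click_at_second)
```

A click in the first position scores 1.0, while the same click in the second position scores 1 / log2(3), roughly 0.63. This is why the metric rewards placing relevant items near the top rather than merely including them in the list.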
Session-Based Recommendations
python
# GRU4Rec - session-based recommendations without user history
import torch
import torch.nn as nn

class GRU4Rec(nn.Module):
    """
    Predicts next item in session based on current session sequence
    Works for anonymous users and new users (mitigates the user cold-start problem)
    """
    def __init__(self, num_items: int, hidden_size: int = 100, num_layers: int = 1):
        super().__init__()
        # Item IDs run from 1 to num_items; index 0 is reserved for padding
        self.embedding = nn.Embedding(num_items + 1, hidden_size, padding_idx=0)
        self.gru = nn.GRU(hidden_size, hidden_size, num_layers, batch_first=True)
        # One logit per index 0..num_items so output indices line up with item IDs
        self.output_layer = nn.Linear(hidden_size, num_items + 1)

    def forward(self, session_items: torch.Tensor) -> torch.Tensor:
        # session_items: [batch_size, session_length]
        # Assumes sessions are left-padded so the last position is the most recent item
        embedded = self.embedding(session_items)
        gru_output, _ = self.gru(embedded)
        last_hidden = gru_output[:, -1, :]  # Last item's hidden state
        scores = self.output_layer(last_hidden)
        return scores  # Unnormalized scores (logits); apply softmax for probabilities

# For new user: recommend based on current session behavior
# Works in first session, no history required
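Because the model reads the hidden state at the last time step (`gru_output[:, -1, :]`), sessions of different lengths need to be padded on the left so that the final position always holds the most recent item. A minimal sketch of that preprocessing step follows, assuming item IDs start at 1 and 0 is the padding ID; `pad_sessions` is a hypothetical helper, not part of PyTorch.

```python
from typing import List

PAD_ID = 0  # matches padding_idx=0 in the embedding layer


def pad_sessions(sessions: List[List[int]], max_len: int) -> List[List[int]]:
    """Keep the most recent max_len items of each session and left-pad
    shorter sessions, so the last column is always the latest click."""
    padded = []
    for session in sessions:
        recent = session[-max_len:]
        padded.append([PAD_ID] * (max_len - len(recent)) + recent)
    return padded


batch = pad_sessions([[5, 3], [7, 2, 9, 4]], max_len=3)
print(batch)
```

The resulting nested list can be passed to `torch.tensor(batch)` to obtain the `[batch_size, session_length]` input the model expects.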
LLM-Powered Conversational Recommendations
python
import anthropic

class ConversationalShoppingAssistant:
    def __init__(self, product_catalog: list, vector_store):
        self.catalog = product_catalog
        # vector_store is assumed to expose search(query, top_k) returning product dicts
        self.vector_store = vector_store
        self.client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
        self.conversation_history = []

    def recommend(self, user_message: str, user_profile: dict) -> str:
        # Search for relevant products
        relevant_products = self.vector_store.search(user_message, top_k=20)
        # Format context
        product_context = "\n".join([
            f"- {p['name']}: {p['description']} (${p['price']}) - {p['rating']} stars"
            for p in relevant_products[:10]
        ])
        self.conversation_history.append({
            "role": "user",
            "content": user_message
        })
        response = self.client.messages.create(
            model="claude-opus-4-5",
            max_tokens=1000,
            system=f"""You are a helpful shopping assistant.
User profile: {user_profile['preferences']}
Budget: up to ${user_profile['budget']}
Available products matching their request:
{product_context}
Provide personalized recommendations with clear reasoning.""",
            messages=self.conversation_history
        )
        assistant_message = response.content[0].text
        self.conversation_history.append({
            "role": "assistant",
            "content": assistant_message
        })
        return assistant_message

# Results in 40% higher click-through vs traditional recommendations
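The system prompt states the user's budget, but the retrieved products are passed to the model regardless of price. One option is to filter by budget before building the product context, so the model only sees items the user can afford. The sketch below assumes the same product dictionary keys used above (`name`, `description`, `price`, `rating`); `build_product_context` and the sample products are hypothetical.

```python
from typing import Dict, List


def build_product_context(products: List[Dict], budget: float, limit: int = 10) -> str:
    """Drop products over budget, put the best-rated first, and format
    up to `limit` of them as prompt lines."""
    affordable = [p for p in products if p["price"] <= budget]
    affordable.sort(key=lambda p: p["rating"], reverse=True)
    return "\n".join(
        f"- {p['name']}: {p['description']} (${p['price']:.2f}) - {p['rating']} stars"
        for p in affordable[:limit]
    )


# Hypothetical search results
products = [
    {"name": "Tent", "description": "Two-person tent", "price": 120.0, "rating": 4.5},
    {"name": "Stove", "description": "Compact gas stove", "price": 45.5, "rating": 4.8},
    {"name": "Luxury Tent", "description": "Six-person tent", "price": 900.0, "rating": 4.9},
]
context = build_product_context(products, budget=200)
print(context)
```

Filtering in code rather than relying on the prompt keeps the budget constraint deterministic and also shortens the prompt.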
A/B Testing Recommendation Systems
python
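# Bandits for recommendation exploration
import random

def contextual_bandit_recommendations(user_features: dict, item_pool: list) -> list:
    """
    Bandits balance exploration (trying new items)
    with exploitation (showing proven items)
    Result: Continuously improving recommendations without formal A/B tests

    This simplified version uses an ε-greedy policy; a full contextual bandit
    (for example with Vowpal Wabbit) would also learn from user_features.
    """
    n = min(10, len(item_pool))  # random.sample raises if n exceeds the pool size
    # ε-greedy: explore 10% of the time
    if random.random() < 0.10:
        return random.sample(item_pool, n)  # Explore
    else:
        # top_ranked_items is assumed to be defined elsewhere (e.g. the Stage 2 ranker)
        return top_ranked_items(user_features, item_pool, n=n)  # Exploit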
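For recommendations to improve over time, the policy also has to learn from feedback. The following sketch shows a non-contextual ε-greedy bandit that tracks a click rate per item and updates it after every impression. The class name, the simulated click rates, and the seeds are all assumptions made for illustration.

```python
import random
from collections import defaultdict
from typing import Dict, List


class EpsilonGreedyBandit:
    """Tracks impressions and clicks per item; explores with probability epsilon."""

    def __init__(self, epsilon: float = 0.10, seed: int = 0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.shows: Dict[str, int] = defaultdict(int)
        self.clicks: Dict[str, int] = defaultdict(int)

    def click_rate(self, item: str) -> float:
        shows = self.shows[item]
        return self.clicks[item] / shows if shows else 0.0

    def select(self, item_pool: List[str]) -> str:
        if self.rng.random() < self.epsilon:
            return self.rng.choice(item_pool)  # Explore
        return max(item_pool, key=self.click_rate)  # Exploit the best estimate

    def update(self, item: str, clicked: bool) -> None:
        self.shows[item] += 1
        self.clicks[item] += int(clicked)


# Simulated environment with hypothetical true click rates
true_click_rate = {"a": 0.05, "b": 0.30}
bandit = EpsilonGreedyBandit(epsilon=0.10, seed=42)
environment = random.Random(7)
for _ in range(5000):
    item = bandit.select(list(true_click_rate))
    bandit.update(item, environment.random() < true_click_rate[item])
print({item: bandit.shows[item] for item in true_click_rate})
```

A contextual bandit extends this by conditioning the estimates on user and item features instead of keeping a single number per item.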
Production Deployment Considerations
Key Takeaways