WhyLabs AI Observatory: Complete Setup Guide
Real-time data and AI monitoring with WhyLabs
WhyLabs built its platform on an idea that outlives any vendor: monitor statistical profiles of your data, not the raw data itself. Its open-source library whylogs generates compact statistical summaries (distributions, missing rates, cardinalities) of whatever flows through your pipeline; the platform watches those profiles for drift and anomalies over time. This guide covers the profile-based monitoring pattern, setup, and where it fits in an LLM-era observability stack. *(Vendor landscape note: check the company's current product status and pricing before committing — this category has consolidated repeatedly; the whylogs pattern itself is open source and portable.)*
The core idea: profiles, not payloads
```python
import pandas as pd
import whylogs as why

# batch_of_predictions: your scoring output (features + predictions + metadata)
df = pd.DataFrame(batch_of_predictions)

results = why.log(df)  # statistical summary, NOT the rows
results.writer("whylabs").write()  # needs WhyLabs credentials; or write locally / to your own store
```
A profile captures per-column distributions, null rates, type counts, and frequent items in a few kilobytes. The raw rows never leave your boundary, which is why this pattern tends to clear privacy review where log-everything tools don't (GDPR-friendly by construction). Profiles from every batch, hour, or day line up into time series, and monitoring becomes: *did today's distribution shift against the baseline?*
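To make "summary, not rows" concrete, here is a minimal standard-library sketch of what a per-column profile might hold. It is a toy stand-in, not the whylogs implementation: whylogs relies on mergeable streaming sketches, while this version computes exact values on a small batch. The names `summarize_column` and `summarize_batch` and the sample records are hypothetical.

```python
from collections import Counter


def summarize_column(values, top_k=3):
    """Toy column profile: counts and summaries only, no raw rows retained."""
    non_null = [v for v in values if v is not None]
    summary = {
        "count": len(values),
        "null_rate": (len(values) - len(non_null)) / len(values) if values else 0.0,
        "type_counts": dict(Counter(type(v).__name__ for v in non_null)),
        "frequent_items": Counter(non_null).most_common(top_k),
    }
    # Numeric columns additionally get simple distribution statistics.
    numeric = [v for v in non_null
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    if numeric:
        summary["min"] = min(numeric)
        summary["max"] = max(numeric)
        summary["mean"] = sum(numeric) / len(numeric)
    return summary


def summarize_batch(rows):
    """Profile a batch given as a list of dicts (one dict per scored record)."""
    columns = sorted({key for row in rows for key in row})
    return {col: summarize_column([row.get(col) for row in rows]) for col in columns}


# Hypothetical scoring batch, for illustration only.
batch = [
    {"age": 34, "country": "DE", "score": 0.91},
    {"age": None, "country": "DE", "score": 0.12},
    {"age": 51, "country": "FR", "score": 0.47},
    {"age": 29, "country": "DE", "score": 0.88},
]
profile = summarize_batch(batch)
print(profile["age"]["null_rate"])
print(profile["country"]["frequent_items"])
```

Only the `profile` dictionary would need to leave the scoring environment. Note that frequent items are the one part of such a summary that echoes actual values, so they are worth a look during privacy review.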
What you catch with it
Setup is: profile every scoring batch → set baselines (training data or a stable window) → alert on divergence metrics per column → route to the owning team.
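The baseline-and-alert step can be sketched without any platform. The example below assumes the population stability index (PSI) as the divergence metric and 0.2 as the alert threshold; both are common rules of thumb chosen for illustration, not anything WhyLabs prescribes. The bin edges, the `OWNERS` mapping, and the sample data are hypothetical.

```python
import math


def histogram(values, edges):
    """Return the fraction of values per bin; values outside the edges are ignored."""
    counts = [0] * (len(edges) - 1)
    last_bin = len(edges) - 2
    for v in values:
        for i in range(len(edges) - 1):
            in_bin = edges[i] <= v < edges[i + 1]
            on_upper_edge = i == last_bin and v == edges[-1]
            if in_bin or on_upper_edge:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def psi(baseline, current, eps=1e-6):
    """Population stability index between two binned distributions."""
    total = 0.0
    for b, c in zip(baseline, current):
        b, c = max(b, eps), max(c, eps)  # avoid log(0) on empty bins
        total += (c - b) * math.log(c / b)
    return total


def find_drift(baseline_cols, current_cols, edges, owners, threshold=0.2):
    """Compare each column against its baseline and return alerts to route."""
    alerts = []
    for col, baseline_values in baseline_cols.items():
        score = psi(histogram(baseline_values, edges),
                    histogram(current_cols[col], edges))
        if score > threshold:
            alerts.append({"column": col, "psi": score,
                           "owner": owners.get(col, "unassigned")})
    return alerts


EDGES = [0.0, 0.25, 0.5, 0.75, 1.0]                       # hypothetical bins
OWNERS = {"score": "ranking-team", "tenure": "data-eng"}  # hypothetical routing

baseline = {
    "score": [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9],
    "tenure": [0.1, 0.3, 0.6, 0.9],
}
today = {
    "score": [0.8, 0.85, 0.9, 0.95, 0.9, 0.7, 0.8, 0.6],  # mass moved to the top bins
    "tenure": [0.1, 0.3, 0.6, 0.9],                       # unchanged
}
alerts = find_drift(baseline, today, EDGES, OWNERS)
for alert in alerts:
    print(alert["column"], round(alert["psi"], 2), "->", alert["owner"])
```

In a real pipeline the binned distributions would come from stored profiles rather than raw values, and the threshold would be tuned per column so that seasonal variation does not trigger an alert.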
The LLM-era extension
The same pattern extends to text systems, with embeddings and metrics standing in for tabular columns.
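As a rough illustration of metrics standing in for columns, the sketch below turns each prompt/response pair into a few numeric and boolean features, then aggregates them into a batch summary that can be tracked like any tabular profile. The metric set, the `REFUSAL_PATTERN` regex, and the function names are assumptions made for this example, not the API of any particular text-metrics toolkit. Embedding-based metrics are omitted to keep it dependency-free.

```python
import re

# Hypothetical refusal heuristic; it only matches straight apostrophes.
REFUSAL_PATTERN = re.compile(r"\b(i can't|i cannot|i'm unable|as an ai)\b",
                             re.IGNORECASE)


def text_metrics(prompt, response):
    """Derive column-like features from one prompt/response pair."""
    return {
        "prompt_chars": len(prompt),
        "response_chars": len(response),
        "response_words": len(response.split()),
        "is_refusal": bool(REFUSAL_PATTERN.search(response)),
        "is_empty": not response.strip(),
    }


def profile_text_batch(pairs):
    """Aggregate per-pair metrics into a batch-level summary (no text retained)."""
    rows = [text_metrics(prompt, response) for prompt, response in pairs]
    n = len(rows)
    return {
        "count": n,
        "mean_response_words": sum(r["response_words"] for r in rows) / n,
        "refusal_rate": sum(r["is_refusal"] for r in rows) / n,
        "empty_rate": sum(r["is_empty"] for r in rows) / n,
    }


# Hypothetical traffic sample, for illustration only.
pairs = [
    ("Summarize this ticket", "The customer reports a login failure after the update."),
    ("What is my neighbour's password?", "I can't help with that."),
    ("Translate to French: hello", ""),
    ("List three colors", "Red, green, blue."),
]
summary = profile_text_batch(pairs)
print(summary)
```

A jump in `refusal_rate` or `empty_rate` from one day to the next is then a drift signal of the same kind as a shifted numeric column, and it can be compared against a baseline window in the same way.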
Production patterns
FAQ
Do I need this if I have Datadog/Grafana? APM monitors *systems* (latency, errors); this monitors *data and predictions*. The profile metrics can land in your existing dashboards — the gap it fills is statistical, not infrastructural.
Open-source-only path? Yes. whylogs profiles, your own storage, and scheduled comparison jobs deliver much of the value without a platform; what you take on is building and maintaining that scheduled pipeline yourself.
When is this overkill? Single low-stakes model, labels arrive instantly, volume is small — eyeball a dashboard. The pattern earns its keep when labels lag, volume is real, or compliance asks "how would you know if the model degraded?"
*Last updated: June 2026. Verify current WhyLabs product status and the maintained text-metrics toolkit before adopting; the whylogs pattern is OSS regardless.*