You cannot secure what you cannot see. When AI agents browse the open web, every navigation event needs structured metadata — IAB category, page type, reputation score, and policy verdict — logged in real time. Our 102 million domain database enriches every agent URL visit with classification data, transforming opaque browsing sessions into fully observable, auditable event streams.
Most agent frameworks log the raw URLs an agent visits, but raw URLs tell you almost nothing. Was that URL a news article or a banking portal? A documentation page or a login screen? Without classification metadata, your observability pipeline is blind.
Enterprise observability platforms like Datadog, Splunk, and Grafana can ingest agent URL logs, but without enrichment, these logs are just lists of domain names. A security analyst reviewing thousands of agent-visited URLs has no efficient way to determine which visits represent normal task behavior and which represent policy violations. They cannot filter by category ("show me all financial site visits"), by page type ("show me all login page attempts"), or by risk level ("show me all visits to domains with reputation scores below 3").
Our 102 million domain database transforms raw URL logs into structured observability events. Before or after every agent navigation, the URL is classified against the database to produce a rich metadata record: IAB v3 categories (up to 4 tiers), web filtering category, page type, OpenPageRank score, global popularity rank, and the policy verdict (allow/block/review). This metadata is attached to the log event and forwarded to your observability platform, where it becomes queryable, filterable, and alertable.
The enrichment transforms "the agent visited app.example.com" into "the agent visited a Financial Services domain (IAB: Business and Finance > Financial Services > Banking), page type: login, PageRank: 7.2, policy verdict: BLOCKED." This single enriched log line tells the security analyst everything they need to know — the visit was to a banking login page, it was blocked by policy, and the domain is well-known (high PageRank). No further investigation needed.
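As a concrete record, the enriched log line described above might serialize like this. The field names are illustrative — a sketch of the shape your enrichment layer emits, not a fixed schema:

```python
import json

# Hypothetical enriched event for a blocked banking-login visit.
# Keys mirror the enrichment fields described above; exact names
# depend on your own pipeline.
event = {
    "url": "https://app.example.com/login",
    "domain": "app.example.com",
    "iab_categories": [
        "Business and Finance",
        "Financial Services",
        "Banking",
    ],
    "page_type": "login",
    "pagerank": 7.2,
    "policy_verdict": "BLOCKED",
}
print(json.dumps(event, indent=2))
```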
Three observability layers that turn raw agent URL logs into actionable intelligence
Every agent navigation produces a metric event tagged with the domain's IAB category, page type, and policy verdict. These metrics feed directly into Prometheus, Datadog, or any StatsD-compatible backend. Build dashboards that show agent navigation volume by category, blocked request rates by page type, and reputation score distributions — all in real time with sub-second latency from event to visualization.
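A minimal sketch of the metric layer, assuming a StatsD-compatible backend that accepts `name:value|c|#tag,...` counter lines (the tag suffix shown is DogStatsD-style; plain StatsD backends drop it):

```python
def navigation_metric(category, page_type, verdict):
    """Format one agent navigation as a DogStatsD-style counter line."""
    tags = f"category:{category},page_type:{page_type},verdict:{verdict}"
    return f"agent.navigation:1|c|#{tags}"

line = navigation_metric("finance", "login", "blocked")
print(line)  # agent.navigation:1|c|#category:finance,page_type:login,verdict:blocked
```

In production this line would be sent over UDP to the StatsD agent; the formatting is the only part that depends on the enrichment data.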
Augment every log entry with classification metadata from the 102M database. The enriched log includes the raw URL, resolved domain, IAB categories at all available tiers, web filtering category, page type label, PageRank score, global popularity rank, country-level rank, and the policy decision. Ship these enriched logs to Elasticsearch, Splunk, or CloudWatch Logs for full-text search and faceted filtering by any metadata dimension.
Define alert rules that trigger on category-level anomalies. Alert when an agent visits more than 5 financial domains in an hour. Alert when any agent accesses a domain with a PageRank score below 2 (indicating a low-reputation or newly registered domain). Alert when blocked request rates exceed a threshold, which may indicate a misbehaving agent or a configuration error. Category metadata makes these alerts possible — raw URLs cannot support them.
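The three alert rules above reduce to simple checks over a window of enriched events. A sketch, assuming each event is a dict with the classification fields described earlier:

```python
from collections import Counter

def check_alerts(events, max_financial_per_hour=5, min_pagerank=2):
    """Evaluate the alert rules described above over one hour of
    enriched navigation events."""
    alerts = []
    # Rule 1: too many financial-site visits in the window
    financial = sum(
        1 for e in events
        if "Financial Services" in e.get("iab_categories", [])
    )
    if financial > max_financial_per_hour:
        alerts.append(f"financial_visits:{financial}")
    # Rule 2: any visit to a low-reputation domain
    for e in events:
        if e.get("pagerank", 0) < min_pagerank:
            alerts.append(f"low_reputation:{e['domain']}")
    # Rule 3: blocked-request rate above threshold
    verdicts = Counter(e.get("policy_verdict") for e in events)
    if events and verdicts["blocked"] / len(events) > 0.2:
        alerts.append(f"block_rate:{verdicts['blocked']}/{len(events)}")
    return alerts

events = [
    {"domain": "bank1.example", "iab_categories": ["Financial Services"],
     "pagerank": 6, "policy_verdict": "blocked"},
    {"domain": "sketchy.example", "iab_categories": ["News"],
     "pagerank": 1, "policy_verdict": "allowed"},
]
print(check_alerts(events))
```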
Production-ready snippets to add category-enriched observability to your agent pipeline
```python
import http.client
import json
import logging
from datetime import datetime, timezone
from urllib.parse import urlencode, urlparse


class AgentObservabilityLogger:
    """Enriches agent URL visits with category metadata."""

    def __init__(self, api_key):
        self.api_key = api_key
        self.conn = http.client.HTTPSConnection(
            "www.websitecategorizationapi.com"
        )
        self.logger = logging.getLogger("agent_observability")
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        self.logger.addHandler(handler)
        self.logger.setLevel(logging.INFO)

    def classify(self, url):
        # URL-encode the parameters so query strings with special
        # characters survive transport intact
        payload = urlencode({
            "query": url,
            "api_key": self.api_key,
            "data_type": "url",
            "expanded_categories": 1,
        })
        headers = {"Content-Type": "application/x-www-form-urlencoded"}
        self.conn.request(
            "POST",
            "/api/iab/iab_web_content_filtering.php",
            payload, headers
        )
        return json.loads(
            self.conn.getresponse().read().decode("utf-8")
        )

    def log_navigation(self, agent_id, url, policy_verdict):
        data = self.classify(url)
        categories = [
            c[0].split("Category name: ")[1]
            for c in data.get("iab_classification", [])
            if "Category name: " in c[0]
        ]
        page_type = data.get("page_type", "unknown")
        web_filter = data.get(
            "filtering_taxonomy", [[""]]
        )[0][0].replace("Category name: ", "")
        event = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "agent_id": agent_id,
            "url": url,
            "domain": urlparse(url).netloc or url,
            "iab_categories": categories,
            "page_type": page_type,
            "web_filter_category": web_filter,
            "pagerank": data.get("open_page_rank", 0),
            "popularity_rank": data.get("global_rank", 0),
            "policy_verdict": policy_verdict,
            "event_type": "agent_navigation",
        }
        self.logger.info(json.dumps(event))
        return event


# Usage — log every agent navigation
obs = AgentObservabilityLogger(api_key="your_api_key")
event = obs.log_navigation(
    agent_id="research-agent-001",
    url="https://news.example.com/tech/ai-trends",
    policy_verdict="allowed"
)
```
```javascript
class AgentMetricsEmitter {
  constructor(apiKey, metricsEndpoint) {
    this.apiKey = apiKey;
    this.metricsEndpoint = metricsEndpoint;
    this.counters = {
      total: 0, allowed: 0,
      blocked: 0, byCategory: {}
    };
  }

  async enrichAndEmit(agentId, url, verdict) {
    const res = await fetch(
      "https://www.websitecategorizationapi.com" +
        "/api/iab/iab_web_content_filtering.php",
      {
        method: "POST",
        headers: {
          "Content-Type": "application/x-www-form-urlencoded"
        },
        body: new URLSearchParams({
          query: url,
          api_key: this.apiKey,
          data_type: "url",
          expanded_categories: "1"
        })
      }
    );
    const data = await res.json();
    const category =
      data.iab_classification?.[0]?.[0]
        ?.replace("Category name: ", "") || "Unknown";
    const pageType = data.page_type || "unknown";

    // Update counters (tolerate verdicts beyond allowed/blocked,
    // e.g. "review", without producing NaN)
    this.counters.total++;
    this.counters[verdict] = (this.counters[verdict] || 0) + 1;
    this.counters.byCategory[category] =
      (this.counters.byCategory[category] || 0) + 1;

    // Emit structured metric
    const metric = {
      name: "agent.navigation",
      tags: {
        agent_id: agentId,
        category: category,
        page_type: pageType,
        verdict: verdict
      },
      value: 1,
      timestamp: Date.now()
    };

    // Send to metrics backend
    await fetch(this.metricsEndpoint, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(metric)
    });
    return metric;
  }
}
```
Purpose-built domain databases for AI agent filtering. Includes IAB categories, 20+ page types, reputation scores, and popularity rankings. One-time purchase with perpetual license.
10 Million Domains with Page-Type Intelligence
One-time purchase: Perpetual license | Optional Updates: $1,599/year
20 Million Domains with Full Intelligence Suite
One-time purchase: Perpetual license | Optional Updates: $2,999/year
50 Million Domains with Complete Intelligence Suite
One-time purchase: Perpetual license | Optional Updates: $4,999/year
Also available: Enterprise URL Database up to 102M domains from $2,499. View all database tiers →
Search any IAB or Web Filtering category to see how many domains are in our 102M Enterprise Database — the same data your AI agent filtering rules will reference.
How 102 million domains from our main Enterprise Database are distributed across IAB v3 taxonomy classifications
Spanning Tier 1 through Tier 4 classifications from our 102M Enterprise Database
Charts display domain counts for the top 50 out of 700+ categories in our 102M Enterprise Database. To check the number of domains for the remaining 650+ categories, use the Category Counter tool above.
Observability for AI agents is fundamentally different from observability for traditional software systems. A web server generates predictable, structured logs — request method, URL path, response code, latency. You can build dashboards and alerts around these well-defined metrics because the output space is bounded. AI agents, by contrast, generate unpredictable, unstructured navigation events — they visit arbitrary websites based on task context, model reasoning, and real-time search results. Without enrichment, an agent's browsing log is a stream of URLs with no semantic context.
Domain-level observability solves this problem by mapping every URL to a structured classification before it enters your observability pipeline. The classification — IAB category, page type, reputation score — provides the semantic context that transforms a raw URL log into an analyzable event stream. You can now ask questions that were previously impossible: "What percentage of agent web traffic goes to financial sites?" "How many login page access attempts occurred this week?" "Which agents are visiting domains with reputation scores below 3?" These questions are the foundation of effective agent governance.
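Each of these questions becomes a one-line aggregation once events carry classification fields. A sketch, assuming events shaped like the enriched records described earlier:

```python
def financial_traffic_share(events):
    """Fraction of navigations that hit a financial domain
    (matches any IAB tier containing "Financial")."""
    if not events:
        return 0.0
    hits = sum(
        1 for e in events
        if any("Financial" in c for c in e.get("iab_categories", []))
    )
    return hits / len(events)

def login_page_attempts(events):
    """Count navigations classified as login pages."""
    return sum(1 for e in events if e.get("page_type") == "login")

events = [
    {"iab_categories": ["Financial Services"], "page_type": "login"},
    {"iab_categories": ["News"], "page_type": "article"},
]
print(financial_traffic_share(events), login_page_attempts(events))
```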
Traditional observability rests on three pillars: metrics, logs, and traces. For AI agents, each pillar is enhanced by domain-level classification data. Metrics gain category dimensions — instead of just counting "agent URL visits per minute," you count "agent URL visits per minute by IAB category and page type." Logs gain structured classification fields — instead of just recording the raw URL, you record the URL alongside its IAB categories, page type, reputation score, and policy verdict. Traces gain semantic context — instead of showing "agent visited URL A, then URL B, then URL C," you show "agent visited a News article, then a Technology blog, then attempted a Financial Services login page (BLOCKED)."
This enrichment happens at the point of data collection, not at query time. The classification data is pre-computed in the 102M domain database, so enriching a log event adds sub-millisecond latency. By the time the event reaches your observability platform, it already carries all the metadata needed for filtering, aggregation, and alerting. No post-ingestion enrichment pipelines needed.
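Because the classification is pre-computed, point-of-collection enrichment is just a key lookup. A minimal sketch, assuming the database has been loaded into an in-process dict keyed by domain (a real deployment might use an embedded key-value store instead):

```python
from urllib.parse import urlparse

# Hypothetical in-process lookup table built from the domain database
# at startup; entries shown here are illustrative.
DOMAIN_DB = {
    "news.example.com": {
        "iab_categories": ["News"],
        "page_type": "article",
        "pagerank": 6.1,
    },
}

def enrich(url):
    """Attach pre-computed classification to a navigation event.
    The lookup is a dict access, so it adds negligible latency."""
    domain = urlparse(url).netloc
    meta = DOMAIN_DB.get(
        domain, {"iab_categories": [], "page_type": "unknown"}
    )
    return {"url": url, "domain": domain, **meta}
```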
With category-enriched observability data, you can build dashboards that provide genuine insight into agent behavior — not just volume metrics. A category distribution panel shows the percentage of agent traffic going to each IAB Tier 1 category, revealing whether agents are staying within their expected task scope. A page-type breakdown shows the distribution of page types visited — a healthy agent should visit mostly homepage, blog, documentation, and pricing pages; spikes in login, admin, or checkout page types indicate potential issues.
A reputation score distribution chart shows the PageRank distribution of domains visited. Agents should overwhelmingly visit high-reputation domains (scores 5-10). A shift toward low-reputation domains (scores 0-2) may indicate that agents are following links to spam sites, phishing pages, or newly registered domains that have not yet been evaluated. A policy verdict timeline shows allow/block/review decisions over time, enabling you to detect policy configuration issues (too many blocks halting agent work) or security incidents (sudden spike in blocked requests from a single agent).
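The numbers behind the panels above are straightforward aggregations over enriched events — a sketch under the same assumed event shape:

```python
from collections import Counter

def category_distribution(events):
    """Percent of traffic per top-tier IAB category — feeds the
    category distribution panel."""
    tier1 = Counter(
        e["iab_categories"][0] for e in events if e.get("iab_categories")
    )
    total = sum(tier1.values())
    return {cat: round(100 * n / total, 1) for cat, n in tier1.items()}

def page_type_breakdown(events):
    """Raw counts per page type — feeds the page-type panel."""
    return dict(Counter(e.get("page_type", "unknown") for e in events))
```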
Category metadata enables anomaly detection algorithms that are impossible with raw URL data. Define baseline category distributions for each agent type — a research agent typically visits 60% News, 25% Technology, 10% Business, 5% other. When the actual distribution deviates significantly from baseline — say, the agent suddenly spends 40% of its traffic on Shopping sites — the anomaly detection system flags the deviation for investigation. This could indicate a prompt injection attack directing the agent to shopping sites, a task misunderstanding, or a data quality issue in the agent's input.
Page-type anomalies are equally valuable. If an agent that normally visits 0 login pages per day suddenly attempts 15 login page visits in an hour, this is a strong signal of either agent malfunction or adversarial input. The domain database makes this detection trivial — page type is a first-class field in every enriched log event, so a simple count query identifies the anomaly without any machine learning.
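The baseline-deviation check described above needs no machine learning either — a sketch comparing per-category traffic shares against a per-agent baseline (threshold and shares are illustrative):

```python
def category_drift(baseline, observed, threshold=0.2):
    """Flag categories whose observed traffic share deviates from the
    agent's baseline by more than `threshold` (shares on a 0-1 scale)."""
    flagged = {}
    for cat in set(baseline) | set(observed):
        delta = abs(observed.get(cat, 0.0) - baseline.get(cat, 0.0))
        if delta > threshold:
            flagged[cat] = round(delta, 2)
    return flagged

# Research agent baseline vs. a window dominated by Shopping traffic
baseline = {"News": 0.60, "Technology": 0.25, "Business": 0.10}
observed = {"News": 0.30, "Technology": 0.20, "Shopping": 0.40}
print(category_drift(baseline, observed))
```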
Regulatory frameworks increasingly require organizations to demonstrate control over AI system behavior. SOC 2 audits ask about access controls and monitoring. GDPR requires demonstrable data protection measures. Industry-specific regulations in healthcare (HIPAA), finance (SOX, PCI DSS), and government (FedRAMP) all include requirements for audit logging and access monitoring. Category-enriched agent observability data satisfies these requirements by providing a complete, structured record of every domain an agent visited, its classification, and the policy decision that was applied.
Compliance reports can be generated directly from the enriched log data: "In Q4 2025, our research agents made 245,000 web navigation requests. 99.2% were allowed under policy. 0.8% were blocked — 0.5% due to category blocks (Adult, Gambling, Malware) and 0.3% due to page-type blocks (login, admin, checkout). Zero financial services domains were accessed by any agent. Zero HR platform domains were accessed by any agent." This level of reporting is only possible with category-enriched observability.
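The headline figures in such a report come from a verdict aggregation over the period's enriched events — a sketch, assuming the event shape used throughout:

```python
from collections import Counter

def compliance_summary(events):
    """Verdict breakdown for a reporting period, as percentages."""
    total = len(events)
    verdicts = Counter(e["policy_verdict"] for e in events)
    return {
        "total_requests": total,
        "allowed_pct": round(100 * verdicts["allowed"] / total, 1),
        "blocked_pct": round(100 * verdicts["blocked"] / total, 1),
    }
```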
The domain classification database integrates with your existing observability infrastructure rather than replacing it. The enrichment happens at the agent harness level — before the log event is emitted — so the enriched data flows into whatever backend you already use. For Datadog, emit enriched events as custom metrics with category and page-type tags. For Splunk, format enriched logs as JSON events that Splunk's field extraction automatically parses. For Elasticsearch, index enriched events with category fields mapped as keyword types for efficient faceted search. For Prometheus, expose category-level counters as labeled metrics that Grafana can visualize.
The integration is lightweight because the enrichment is a simple database lookup, not a complex ETL pipeline. The agent harness calls the database, receives the classification, attaches it to the log event, and emits the event to your existing log shipper (Fluentd, Logstash, Vector, or the platform's native agent). No new infrastructure required — just a classification lookup step added to your existing agent middleware.
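The hand-off to the shipper can be as simple as writing one JSON object per line, which Fluentd, Vector, and Splunk's JSON field extraction all ingest without custom parsing — a sketch; adapt the transport to your shipper:

```python
import json
import sys

def ship(event, stream=sys.stdout):
    """Write one enriched event as a single compact JSON line,
    ready for tailing by a log shipper."""
    stream.write(json.dumps(event, separators=(",", ":")) + "\n")
```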
Without category enrichment, organizations resort to manual URL review — security analysts scrolling through lists of thousands of URLs per day, manually visiting each one to determine its category and risk level. At an average investigation time of 30 seconds per URL and 10,000 agent URL visits per day, this requires approximately 83 hours of analyst time daily — an impossibility for any security team. The alternative is to ignore the logs entirely, which means accepting that agent web traffic is completely unmonitored.
Category enrichment from the 102M database eliminates this manual review entirely. Every URL is automatically classified, every log event carries structured metadata, and analysts can filter and aggregate by category instead of reviewing individual URLs. The database is a one-time purchase that replaces an infinite manual review workload with an automated, sub-millisecond enrichment pipeline.
Transform opaque agent web traffic into fully observable, category-enriched event streams. One-time purchase, perpetual license, 102 million domains classified and ready.