Enterprises are deploying AI agents that browse the open web — but existing security controls were designed for human users, not autonomous software. A 102 million domain database with IAB categories, page-type detection, and reputation scoring gives your security team the governance layer they need to enforce real policies on real agent traffic.
CASBs, web proxies, and firewall rules all assume a human is behind the browser. When an AI agent browses independently, every one of those assumptions breaks.
Enterprise security stacks — Zscaler, Netskope, Palo Alto Prisma — are built around the concept of a user session. They track browser fingerprints, enforce SSO-gated access, and apply policies per employee identity. AI agents do not authenticate through SSO. They do not have browser fingerprints. They operate in headless environments, making HTTP requests at machine speed without any of the signals your security stack relies on. The result is a blind spot the size of the entire internet.
Instead of retrofitting human-centric security tools, enterprises need a data layer that sits inside the agent's own execution environment. Our 102 million domain database provides exactly this: a pre-computed classification of every domain an agent will encounter, including IAB v3 taxonomy categories, web filtering labels, page-type detection (login, checkout, admin, settings, and 16 more), reputation scores, and global popularity rankings.
The database deploys as a local lookup table — Redis, PostgreSQL, SQLite, or any key-value store — so every URL check completes in under 1ms with zero external API calls. Your agent's middleware queries the database before each navigation event, evaluates the result against your enterprise policy rules, and either allows or blocks the request. No network hop, no cloud dependency, no latency penalty. The same governance that CASBs provide for human browsing, delivered as a data product for autonomous agents.
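To make the pre-navigation check concrete, here is a minimal sketch of a local lookup using SQLite. The schema, column names, and reputation threshold are illustrative assumptions, not the shipped database's actual layout:

```python
import sqlite3

# Hypothetical schema; the real database's column layout may differ.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE domains (
        domain TEXT PRIMARY KEY,
        iab_category TEXT,
        page_types TEXT,
        reputation REAL
    )
""")
conn.execute(
    "INSERT INTO domains VALUES (?, ?, ?, ?)",
    ("example-casino.com", "Gambling", "signup,login", 0.21),
)

def check_before_navigation(domain, blocked_categories):
    """Middleware hook: local lookup, no network call."""
    row = conn.execute(
        "SELECT iab_category, reputation FROM domains WHERE domain = ?",
        (domain,),
    ).fetchone()
    if row is None:
        return "review"  # unknown domain: escalate rather than guess
    category, reputation = row
    if category in blocked_categories or reputation < 0.3:
        return "block"
    return "allow"

print(check_before_navigation("example-casino.com", {"Gambling"}))  # block
```

Because the lookup is a single indexed read against a local store, it stays well under the 1ms budget even under heavy agent traffic.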
Three deployment models that bring enterprise-grade governance to every agent in your organization
Deploy the 102M database on a centralized policy server that all agents in your organization query before navigating. Define enterprise-wide rules: block all adult content, restrict financial services to research agents only, flag healthcare sites for compliance review. Every agent checks in, gets a verdict, and logs the decision — giving your security team a single pane of glass for agent web traffic.
Not all agents need the same access. A customer research agent should browse e-commerce and review sites. A legal compliance agent should access government and regulatory sites. Assign each agent a policy profile that maps IAB categories and page types to allow/block/review decisions. The database provides the classification; your policy engine provides the logic.
Every domain lookup generates a structured log entry: URL, IAB category, page type, policy decision, agent identity, timestamp. Feed these logs into your SIEM (Splunk, Datadog, Elastic) for real-time dashboards, anomaly detection, and compliance reporting. When auditors ask where your agents have been browsing, you have the receipts.
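A log entry with those fields might be emitted as one JSON-lines record per lookup, which most SIEMs ingest directly. The field names below follow the list above but are otherwise illustrative:

```python
import json
from datetime import datetime, timezone

def audit_entry(url, category, page_type, decision, agent_id):
    """One JSON-lines record per domain lookup, ready for SIEM ingestion."""
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "url": url,
        "iab_category": category,
        "page_type": page_type,
        "decision": decision,
    })

line = audit_entry("https://example.com/signin", "Technology & Computing",
                   "login", "block", "research-agent-01")
print(line)
```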
Production-ready snippets for implementing enterprise-grade agent web access controls
import http.client
import json
from datetime import datetime, timezone


class EnterpriseGuardrailService:
    """Centralized guardrail that enforces enterprise web
    access policies for all AI agents."""

    RISK_CATEGORIES = {
        "hard_block": ["Adult", "Malware", "Illegal Content",
                       "Gambling", "Weapons"],
        "soft_block": ["Cryptocurrency", "Hacking", "Tobacco"],
        "review": ["Financial Services", "Healthcare",
                   "Government"],
    }
    BLOCKED_PAGE_TYPES = ["login", "checkout", "admin",
                          "settings", "signup"]

    def __init__(self, api_key, agent_id):
        self.api_key = api_key
        self.agent_id = agent_id
        self.conn = http.client.HTTPSConnection(
            "www.websitecategorizationapi.com"
        )
        self.audit_log = []

    def evaluate_url(self, target_url):
        payload = (
            f"query={target_url}"
            f"&api_key={self.api_key}"
            f"&data_type=url"
            f"&expanded_categories=1"
        )
        headers = {
            "Content-Type": "application/x-www-form-urlencoded"
        }
        self.conn.request(
            "POST",
            "/api/iab/iab_web_content_filtering.php",
            payload,
            headers,
        )
        res = self.conn.getresponse()
        data = json.loads(res.read().decode("utf-8"))
        return self._apply_enterprise_policy(target_url, data)

    def _apply_enterprise_policy(self, url, data):
        # split()[-1] tolerates responses without the "Category name: " prefix
        categories = [
            c[0].split("Category name: ")[-1]
            for c in data.get("iab_classification", [])
        ]
        page_type = data.get("page_type", "unknown")
        decision = "allow"
        reason = "Passed all enterprise policy checks"

        if page_type in self.BLOCKED_PAGE_TYPES:
            decision = "block"
            reason = f"Blocked page type: {page_type}"
        else:
            for cat in categories:
                # Hard blocks take precedence and end the evaluation.
                if any(b.lower() in cat.lower()
                       for b in self.RISK_CATEGORIES["hard_block"]):
                    decision = "block"
                    reason = f"Hard block category: {cat}"
                    break
                # Soft-block and review categories escalate to human
                # review; the first matching category sets the reason.
                if decision == "allow" and any(
                        f.lower() in cat.lower()
                        for f in (self.RISK_CATEGORIES["soft_block"]
                                  + self.RISK_CATEGORIES["review"])):
                    decision = "review"
                    reason = f"Requires review: {cat}"

        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "agent_id": self.agent_id,
            "url": url,
            "categories": categories,
            "page_type": page_type,
            "decision": decision,
            "reason": reason,
        }
        self.audit_log.append(entry)
        return decision, reason


# Usage
guardrail = EnterpriseGuardrailService(
    api_key="your_key", agent_id="research-agent-01"
)
verdict, msg = guardrail.evaluate_url("https://example.com")
print(f"Decision: {verdict} — {msg}")
async function enterprisePolicyGateway(targetURL, agentConfig) {
  const response = await fetch(
    "https://www.websitecategorizationapi.com" +
      "/api/iab/iab_web_content_filtering.php",
    {
      method: "POST",
      headers: {
        "Content-Type": "application/x-www-form-urlencoded"
      },
      body: new URLSearchParams({
        query: targetURL,
        api_key: agentConfig.apiKey,
        data_type: "url",
        expanded_categories: "1"
      })
    }
  );
  const classification = await response.json();
  const pageType = classification.page_type || "unknown";
  const iabCategories =
    classification.iab_classification?.map(
      (c) => c[0]?.replace("Category name: ", "")
    ) || [];

  const guardrailResult = {
    url: targetURL,
    agentId: agentConfig.agentId,
    categories: iabCategories,
    pageType: pageType,
    action: "allow",
    timestamp: new Date().toISOString()
  };

  // Enforce enterprise guardrail rules
  if (agentConfig.blockedPageTypes.includes(pageType)) {
    guardrailResult.action = "block";
    guardrailResult.reason = `Page type "${pageType}" blocked`;
  } else {
    for (const cat of iabCategories) {
      if (agentConfig.blockedCategories.some(
        (b) => cat.toLowerCase().includes(b.toLowerCase())
      )) {
        guardrailResult.action = "block";
        guardrailResult.reason = `Category "${cat}" blocked`;
        break; // first blocked category decides the verdict
      }
    }
  }
  return guardrailResult;
}
Purpose-built domain databases for AI agent filtering. Includes IAB categories, 20+ page types, reputation scores, and popularity rankings. One-time purchase with perpetual license.
10 Million Domains with Page-Type Intelligence
One-time purchase: Perpetual license | Optional Updates: $1,599/year
20 Million Domains with Full Intelligence Suite
One-time purchase: Perpetual license | Optional Updates: $2,999/year
50 Million Domains with Complete Intelligence Suite
One-time purchase: Perpetual license | Optional Updates: $4,999/year
Also available: Enterprise URL Database up to 102M domains from $2,499. View all database tiers →
Search any IAB or Web Filtering category to see how many domains are in our 102M Enterprise Database — the same data your enterprise guardrails will reference.
How 102 million domains from our main Enterprise Database are distributed across IAB v3 taxonomy classifications
Spanning Tier 1 through Tier 4 classifications from our 102M Enterprise Database
Charts display domain counts for the top 50 out of 700+ categories in our 102M Enterprise Database. To check the number of domains for the remaining 650+ categories, use the Category Counter tool above.
The enterprise adoption of agentic AI is accelerating at a pace that outstrips security team readiness. According to industry surveys, over 60% of Fortune 500 companies have at least one AI agent in production that accesses external websites. Yet fewer than 15% of those deployments have purpose-built controls governing where those agents can navigate. The gap between agent deployment speed and governance readiness is the single largest unaddressed risk in enterprise AI today.
Traditional security architectures were designed around a fundamental assumption: a human decides which URLs to visit. Firewalls, web proxies, and CASBs intercept traffic from browsers operated by authenticated employees. The security team writes rules that say "block gambling sites for all users" or "allow financial services only for the treasury department." These rules work because the subject is always a human with an identity, a department, and a risk profile. AI agents break every one of these assumptions. They do not have departments. They do not authenticate through your SSO provider. They operate in headless environments that your web proxy cannot even see.
An effective guardrail system for agentic AI web access has four components. First, a classification layer that maps every URL the agent encounters to a structured taxonomy — this is where the 102M domain database fits. Second, a policy engine that evaluates each classification against enterprise rules (block adult content, restrict financial services, flag healthcare for compliance review). Third, an enforcement point that sits between the agent's decision to navigate and the actual HTTP request. Fourth, an audit trail that records every classification, every policy decision, and every enforcement action.
The classification layer is the foundation. Without it, the policy engine has nothing to evaluate. Without the policy engine, the classification is just metadata. Without the enforcement point, the policy engine is advisory only. Without the audit trail, you cannot prove compliance. All four components must work together, and the classification layer — the 102M domain database — is the piece most organizations lack.
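The four components can be wired together in a few lines. The sketch below is illustrative only; the function names and the in-memory "database" stand in for the real classification store and policy engine:

```python
# 1. Classification layer: stand-in for the domain database lookup.
def classify(domain):
    db = {"bad-casino.example": {"category": "Gambling",
                                 "page_type": "signup"}}
    return db.get(domain, {"category": "Unknown", "page_type": "unknown"})

# 2. Policy engine: evaluates the classification against rules.
def decide(record, rules):
    return "block" if record["category"] in rules["blocked"] else "allow"

# 4. Audit trail: every decision is recorded.
audit_log = []

# 3. Enforcement point: sits between the agent's intent and the request.
def navigate(domain, rules):
    record = classify(domain)
    decision = decide(record, rules)
    audit_log.append((domain, record["category"], decision))
    if decision == "block":
        raise PermissionError(f"Navigation to {domain} blocked")
    return f"GET https://{domain}/"

rules = {"blocked": {"Gambling", "Adult"}}
print(navigate("docs.example.org", rules))
```

Note that the enforcement point raises rather than warns: a guardrail that only logs is the "advisory only" failure mode described above.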
Cloud Access Security Brokers (CASBs) like Netskope, Zscaler, and Microsoft Defender for Cloud Apps work by intercepting traffic at the network layer. They install certificates on employee devices, inspect TLS traffic, and apply URL filtering rules based on the employee's identity and device posture. AI agents bypass every one of these mechanisms. They run in cloud compute environments, not on managed devices. They do not have device certificates. Their TLS traffic does not route through the corporate proxy unless explicitly configured to do so — and most agent frameworks do not support proxy configuration out of the box.
Even if you manage to route agent traffic through your CASB, the categorization data inside the CASB was designed for human browsing patterns. CASBs categorize domains into broad buckets like "social media," "streaming," and "productivity." They do not distinguish between a login page, a checkout page, and a settings panel on the same domain. For agent governance, page-type granularity is essential. An agent that visits amazon.com/products is doing legitimate research. An agent that visits amazon.com/signin is a security incident. CASBs cannot make this distinction because they operate at the domain level, not the page level.
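The research-versus-incident distinction reduces to a page-type check. The lookup table below is a hypothetical stand-in for the database's page-type labels:

```python
from urllib.parse import urlparse

# Hypothetical page-type data keyed by (domain, path); real labels
# would come from the database.
PAGE_TYPES = {
    ("amazon.com", "/products"): "product",
    ("amazon.com", "/signin"): "login",
}

def is_security_incident(url):
    """An agent reaching a login page is an incident; browsing is not."""
    p = urlparse(url)
    page_type = PAGE_TYPES.get((p.netloc, p.path), "unknown")
    return page_type == "login"

print(is_security_incident("https://amazon.com/signin"))    # True
print(is_security_incident("https://amazon.com/products"))  # False
```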
The 102M domain database serves as the fundamental security primitive for enterprise agent governance. It is the data layer that makes policy enforcement possible. Each record in the database contains a domain, its IAB v3 taxonomy classification (four tiers of increasing specificity), its web filtering category (security-focused labels like Malware, Phishing, Adult, Gambling), its page-type labels (login, checkout, admin, settings, signup, pricing, careers, contact, and 12 more), its OpenPageRank authority score, and its global popularity ranking.
This multi-dimensional classification enables policy rules that are both broad and precise. Block all domains in the "Adult" web filtering category — that is a broad rule. Block navigation to any page with a "login" page type on domains outside the approved vendor list — that is a precise rule. Both rules use the same data source. Both rules execute in the same sub-millisecond lookup. The database provides the vocabulary; your policy engine provides the grammar.
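Both rule shapes can share one evaluation function. This is a sketch under assumed field values; `APPROVED_VENDORS` is an illustrative allow-list, not part of the database:

```python
APPROVED_VENDORS = {"login.vendor.example"}  # hypothetical allow-list

def evaluate(domain, filtering_category, page_type):
    # Broad rule: block an entire web-filtering category.
    if filtering_category == "Adult":
        return "block"
    # Precise rule: block login pages outside the approved vendor list.
    if page_type == "login" and domain not in APPROVED_VENDORS:
        return "block"
    return "allow"

print(evaluate("random.example", "Shopping", "login"))        # block
print(evaluate("login.vendor.example", "Shopping", "login"))  # allow
```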
Enterprises typically operate multiple AI agents with different missions. A customer research agent needs access to review sites, competitor websites, and industry publications. A legal compliance agent needs access to government databases, regulatory filings, and court records. A marketing agent needs access to social media platforms, advertising networks, and brand monitoring tools. Giving all agents the same access policy is both too permissive and too restrictive — too permissive because the marketing agent should not browse financial databases, too restrictive because the compliance agent needs access to sites that would normally be blocked for other roles.
The solution is role-based access control (RBAC) for agents, powered by domain categorization. Define an access profile for each agent role that specifies which IAB categories and page types are allowed, blocked, or flagged for review. The customer research agent gets access to IAB categories like "Shopping," "Technology & Computing," and "Business and Finance" — but not "Financial Services > Banking" or any page types classified as "checkout" or "login." The compliance agent gets access to "Law, Government, & Politics" and "Business and Finance" but not "Entertainment" or "Adult." Each profile is a mapping from database fields to policy actions.
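A profile of this kind can be expressed as plain data. The category and page-type values below follow the examples in this section; the structure itself is an assumption about how you might encode it:

```python
AGENT_PROFILES = {
    "customer-research": {
        "allow_categories": {"Shopping", "Technology & Computing",
                             "Business and Finance"},
        "block_page_types": {"checkout", "login"},
    },
    "legal-compliance": {
        "allow_categories": {"Law, Government, & Politics",
                             "Business and Finance"},
        "block_page_types": {"checkout"},
    },
}

def rbac_decision(agent_role, category, page_type):
    """Map a classification to allow/block for a given agent role."""
    profile = AGENT_PROFILES[agent_role]
    if page_type in profile["block_page_types"]:
        return "block"
    if category not in profile["allow_categories"]:
        return "block"
    return "allow"
```

Because the profile is data rather than code, security teams can review and version it like any other policy artifact.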
Once your guardrail system is logging every agent navigation decision, you unlock real-time monitoring and anomaly detection. Set up alerts for patterns that indicate agent misbehavior: an agent visiting more than 10 unique domains per minute, an agent repeatedly hitting domains in blocked categories (which might indicate prompt injection), an agent navigating to page types it has never visited before, or an agent accessing domains with low reputation scores. These anomaly signals are only possible because the domain database provides the structured metadata that raw URLs lack.
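Two of those signals, the domain-rate threshold and repeated blocked-category hits, can be sketched with a sliding window. The thresholds below mirror the examples above but should be tuned to your own traffic:

```python
from collections import defaultdict, deque
import time

class AgentAnomalyMonitor:
    """Flags agents whose browsing crosses simple thresholds
    (illustrative sketch, not a production detector)."""

    def __init__(self, max_domains_per_minute=10, max_blocked_hits=3):
        self.max_rate = max_domains_per_minute
        self.max_blocked = max_blocked_hits
        self.visits = defaultdict(deque)      # agent_id -> (ts, domain)
        self.blocked_hits = defaultdict(int)  # agent_id -> block count

    def record(self, agent_id, domain, decision, now=None):
        now = now if now is not None else time.time()
        window = self.visits[agent_id]
        window.append((now, domain))
        # Drop visits older than the 60-second window.
        while window and now - window[0][0] > 60:
            window.popleft()
        if decision == "block":
            self.blocked_hits[agent_id] += 1
        alerts = []
        if len({d for _, d in window}) > self.max_rate:
            alerts.append("domain-rate-exceeded")
        if self.blocked_hits[agent_id] >= self.max_blocked:
            # Repeated blocked-category hits can indicate prompt injection.
            alerts.append("repeated-blocked-categories")
        return alerts
```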
Feed the guardrail logs into your existing SIEM — Splunk, Datadog, Elastic, or Sentinel. Create dashboards that show agent traffic by category, page type, policy decision, and agent identity. Set up automated playbooks that quarantine an agent when it triggers specific anomaly thresholds. The domain database turns agent web traffic from an opaque stream of URLs into structured, actionable intelligence that your security operations team can actually work with.
Regulators are beginning to pay attention to AI agent activity. The EU AI Act includes provisions for high-risk AI systems that interact with external data sources. GDPR requires organizations to document and justify the data processing activities of their AI systems — including data accessed by agents browsing the web. SOC 2 Type II audits increasingly ask about AI governance controls. Without a documented, auditable guardrail system, enterprises face regulatory risk from their agent deployments.
The domain database provides the evidence trail that auditors require. Every domain the agent visited is classified. Every policy decision is logged. The database itself serves as documentation of the classification methodology — IAB taxonomy, web filtering categories, and page-type labels are industry-standard frameworks that auditors understand. When a regulator asks "how do you control what your AI agents access on the web," the answer is a structured system with pre-classified domains, deterministic policy rules, and immutable audit logs — not "we prompt the agent to be careful."
For enterprises running dozens or hundreds of agents, the deployment architecture matters. The recommended pattern is a centralized classification service backed by the 102M database in Redis or PostgreSQL. Each agent's middleware calls the classification service via a local gRPC or REST endpoint. The classification service returns the domain's categories, page types, and reputation in a single response. The agent's middleware then evaluates the response against the agent-specific policy profile and enforces the decision.
This centralized architecture ensures consistency — all agents use the same classification data — and simplifies updates. When you refresh the database (quarterly updates available), you update a single data store rather than redeploying every agent. The centralized service also becomes the natural logging point: every lookup is recorded, creating the audit trail that compliance teams require. For organizations that need multi-region deployment, replicate the database to each region and have agents query their local instance for sub-millisecond latency regardless of geography.
Stop retrofitting human security tools for agent traffic. Deploy a purpose-built domain intelligence layer that gives your security team real governance over every AI agent in your organization.