
The Policy Engine Every AI Agent Web Browser Needs

An AI agent without a policy engine is a liability in production. Every URL navigation, every form interaction, every data retrieval must pass through a deterministic decision layer that evaluates the target against your organizational rules. Our 102 million domain database provides the structured categorization data that transforms a policy engine from a theoretical framework into a production-grade control plane.

  • 102M Classified Domains
  • 3-Way Decision Logic
  • <1ms Lookup Latency
  • 20+ Page Types

The Problem: Agents Make Decisions Without a Framework

Without a policy engine, every navigation decision is made by the LLM itself — a non-deterministic system that cannot guarantee consistent, auditable outcomes.

LLM-Based Decision Making Is Not a Policy Engine

When organizations deploy browser-using agents without a dedicated policy engine, the LLM becomes the de facto decision maker for every navigation event. The agent's prompt might say "don't visit dangerous websites," but the LLM has no structured mechanism to evaluate what constitutes "dangerous." It relies on its training data, which may be outdated, and its inference, which is inherently probabilistic. The result is a decision system that is neither consistent nor auditable nor fast enough for production workloads.

  • Non-deterministic outcomes: The same URL evaluated twice by the same LLM may produce different decisions, making it impossible to enforce consistent policies across agent instances
  • Latency penalty: Every policy evaluation requires a model inference call, adding 500ms to 2 seconds to each navigation decision — unacceptable for agents processing hundreds of URLs per session
  • Audit failure: An LLM's reasoning for why it allowed or blocked a URL is a natural language explanation, not a structured policy reference — compliance teams cannot audit or reproduce the decision
  • Prompt injection risk: Malicious website content can influence the LLM's navigation decisions through prompt injection, causing the agent to bypass its own safety guidelines

The Solution: A Deterministic Policy Engine Powered by URL Categorization

A proper policy engine operates outside the LLM entirely. It intercepts the agent's navigation intent, resolves the target URL against the 102M domain database, evaluates the returned category data against your policy rules, and returns a deterministic decision: allow, block, or escalate to human review. The LLM never participates in the policy decision — it only receives the outcome.

This architecture is fundamentally different from prompt-based filtering. The policy engine is a separate code module with defined inputs (URL, agent context) and outputs (decision, reason, audit record). It returns a decision in under 2ms for cached domains, produces identical outputs for identical inputs, and generates structured audit logs that compliance teams can review, reproduce, and certify.
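The inputs and outputs described above can be sketched as a small typed interface. This is a minimal illustration, not the vendor's implementation: the names `PolicyRequest`, `PolicyResult`, `decide`, and the `agent_id` and `task` fields are assumptions introduced here, and the category sets are sample values.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    REVIEW = "review"


@dataclass(frozen=True)
class PolicyRequest:
    url: str
    agent_id: str  # hypothetical identifier for the calling agent
    task: str      # hypothetical label for the agent's current task


@dataclass(frozen=True)
class PolicyResult:
    decision: Decision
    reason: str


# Sample rule sets; a real deployment would load these from configuration.
BLOCKED = frozenset({"Adult", "Malware", "Phishing"})
REVIEW = frozenset({"Gambling"})


def decide(request: PolicyRequest, category: str) -> PolicyResult:
    """Pure function: the same request and category always give the same result."""
    if category in BLOCKED:
        return PolicyResult(Decision.BLOCK, f"Category blocked: {category}")
    if category in REVIEW:
        return PolicyResult(Decision.REVIEW, f"Category flagged: {category}")
    return PolicyResult(Decision.ALLOW, "Within policy boundaries")
```

Because `decide` reads no clock, network, or model output, two calls with equal arguments return equal results, which is the property the audit process relies on.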

Policy Decision Tree

URL classification flowing through multi-tier policy evaluation

Policy Engine Architecture

Three layers that transform URL categorization into enforceable policy decisions

Classification Layer

The first layer resolves the target URL against the 102M domain database. The output is a structured record containing IAB taxonomy categories (up to 4 tiers), web filtering category, page type (login, checkout, admin, etc.), reputation score, and global popularity ranking. This classification happens in under 1ms for cached domains.
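As a rough sketch of what this layer returns, the record below mirrors the fields listed above. The field names, the `LOCAL_DB` dictionary, and the sample values are all placeholders invented for illustration; the real database schema may differ.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Classification:
    iab_tiers: Tuple[str, ...]  # up to 4 IAB tier names, broadest first
    filtering_category: str
    page_type: str
    reputation: float
    popularity_rank: int


# Stand-in for the local database; every value here is a made-up sample.
LOCAL_DB = {
    "example.com": Classification(
        ("Business and Finance",), "Business", "unknown", 0.0, 0
    ),
}


def hostname_of(url: str) -> str:
    """Extract the host and drop a leading 'www.' so lookups share one key."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def classify_local(url: str) -> Optional[Classification]:
    """Return the cached record for the URL's host, or None on a miss."""
    return LOCAL_DB.get(hostname_of(url))
```

A miss returns `None` rather than a default record, so the caller can decide whether to fall back to the API or apply a default action.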

Evaluation Layer

The second layer evaluates the classification against your policy rules. Rules are expressed as category-to-action mappings: "Adult" maps to "block," "Business and Finance" maps to "allow," "Social Networking" maps to "review." Page-type rules override category rules — a login page is blocked regardless of what category the domain belongs to.
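The category-to-action mapping and the page-type override can be expressed in a few lines. This is a minimal sketch using the three example mappings from the paragraph above; the `DEFAULT_ACTION` for unmapped categories is an assumption, not something the source specifies.

```python
CATEGORY_ACTIONS = {
    "Adult": "block",
    "Business and Finance": "allow",
    "Social Networking": "review",
}
BLOCKED_PAGE_TYPES = {"login"}
DEFAULT_ACTION = "review"  # assumption: unmapped categories go to a human


def evaluate(category: str, page_type: str) -> str:
    """Page-type rules are checked first, so they override category rules."""
    if page_type in BLOCKED_PAGE_TYPES:
        return "block"
    return CATEGORY_ACTIONS.get(category, DEFAULT_ACTION)
```

With these rules, a login page on a "Business and Finance" domain is blocked even though the category alone would be allowed.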

Enforcement Layer

The third layer enforces the decision by controlling the agent's HTTP client. An "allow" decision permits the navigation. A "block" decision prevents the HTTP request from firing and returns a policy violation message to the agent. A "review" decision queues the request for human approval and pauses the agent's workflow until the review is complete.
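One way to picture enforcement is a thin wrapper that owns the only path to the HTTP client. The sketch below is illustrative: `Enforcer`, `PolicyViolation`, and the injected `evaluate` and `fetch` callables are hypothetical names, and the review queue is a plain in-memory deque rather than a real approval system.

```python
from collections import deque
from typing import Callable, Optional


class PolicyViolation(Exception):
    """Raised to tell the agent that a navigation was blocked by policy."""


class Enforcer:
    def __init__(self, evaluate: Callable[[str], str],
                 fetch: Callable[[str], str]):
        self.evaluate = evaluate     # returns "allow", "block", or "review"
        self.fetch = fetch           # the agent's real HTTP client call
        self.review_queue = deque()  # URLs waiting for human approval

    def navigate(self, url: str) -> Optional[str]:
        action = self.evaluate(url)
        if action == "allow":
            return self.fetch(url)
        if action == "block":
            # The HTTP request never fires for a blocked URL.
            raise PolicyViolation(f"Blocked by policy: {url}")
        # "review": park the request; the caller pauses until a human decides.
        self.review_queue.append(url)
        return None
```

Passing `fetch` in as a parameter keeps the agent from holding a second, unguarded reference to the HTTP client.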

Policy Engine Mechanism

Interlocking rule sets processing agent navigation requests

Policy Engine Implementation Code

Production-ready snippets for building a three-tier policy engine with URL categorization

Python — Policy Engine with Category Evaluation

import http.client
import json
from datetime import datetime, timezone
from urllib.parse import urlencode


class AgentPolicyEngine:
    """Three-tier policy engine: classify, evaluate, enforce.
    Uses URL categorization to make deterministic decisions."""

    BLOCK_CATEGORIES = ["Adult", "Malware", "Phishing", "Illegal"]
    REVIEW_CATEGORIES = ["Gambling", "Weapons", "Drugs"]
    BLOCK_PAGE_TYPES = ["login", "checkout", "admin", "settings"]

    def __init__(self, api_key, allowed_categories=None):
        self.api_key = api_key
        # Stored for callers that want an allowlist; evaluate() does not read it.
        self.allowed_categories = set(
            c.lower() for c in (allowed_categories or [])
        )
        self.conn = http.client.HTTPSConnection(
            "www.websitecategorizationapi.com"
        )
        self.audit_log = []

    def classify(self, url):
        """Layer 1: Resolve URL to structured category data."""
        # urlencode escapes "&", "=", and "#" in the target URL so they
        # cannot corrupt the form body.
        payload = urlencode({
            "query": url,
            "api_key": self.api_key,
            "data_type": "url",
            "expanded_categories": "1",
        })
        headers = {
            "Content-Type": "application/x-www-form-urlencoded"
        }
        self.conn.request(
            "POST",
            "/api/iab/iab_web_content_filtering.php",
            payload, headers
        )
        res = self.conn.getresponse()
        return json.loads(res.read().decode("utf-8"))

    def evaluate(self, classification):
        """Layer 2: Apply policy rules to classification."""
        # [-1] keeps the whole string if the "Category name: " prefix is absent,
        # instead of raising IndexError.
        categories = [
            c[0].split("Category name: ")[-1]
            for c in classification.get("iab_classification", [])
        ]
        page_type = classification.get("page_type", "unknown")

        # Page-type rules override everything
        if page_type in self.BLOCK_PAGE_TYPES:
            return "block", f"Page type blocked: {page_type}"

        # Hard-block dangerous categories
        for cat in categories:
            for blocked in self.BLOCK_CATEGORIES:
                if blocked.lower() in cat.lower():
                    return "block", f"Category blocked: {cat}"

        # Escalate review categories
        for cat in categories:
            for review in self.REVIEW_CATEGORIES:
                if review.lower() in cat.lower():
                    return "review", f"Category flagged: {cat}"

        return "allow", "Within policy boundaries"

    def enforce(self, url):
        """Layer 3: Full pipeline: classify, evaluate, log."""
        classification = self.classify(url)
        decision, reason = self.evaluate(classification)
        audit_entry = {
            # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated.
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "url": url,
            "decision": decision,
            "reason": reason
        }
        self.audit_log.append(audit_entry)
        return decision, reason


# Deploy the engine
engine = AgentPolicyEngine(api_key="your_api_key")
decision, reason = engine.enforce("https://example.com/login")
print(f"Decision: {decision} ({reason})")

JavaScript — Policy Enforcement Middleware

class PolicyEnforcementMiddleware {
  constructor(apiKey, policyRules) {
    this.apiKey = apiKey;
    this.rules = policyRules;
    this.auditLog = [];
  }

  // Layer 1: resolve the URL to structured category data.
  async classify(targetURL) {
    const response = await fetch(
      "https://www.websitecategorizationapi.com" +
        "/api/iab/iab_web_content_filtering.php",
      {
        method: "POST",
        headers: {
          "Content-Type": "application/x-www-form-urlencoded"
        },
        body: new URLSearchParams({
          query: targetURL,
          api_key: this.apiKey,
          data_type: "url",
          expanded_categories: "1"
        })
      }
    );
    return await response.json();
  }

  // Layer 2: apply policy rules; page-type rules are checked first.
  evaluate(classification) {
    const pageType = classification.page_type || "unknown";
    const filterCat =
      classification.filtering_taxonomy?.[0]?.[0]
        ?.replace("Category name: ", "") || "Unknown";

    if (this.rules.blockedPageTypes.includes(pageType)) {
      return { action: "block", reason: `Page type: ${pageType}` };
    }
    if (this.rules.blockedCategories.includes(filterCat)) {
      return { action: "block", reason: `Category: ${filterCat}` };
    }
    if (this.rules.reviewCategories?.includes(filterCat)) {
      return { action: "review", reason: `Review: ${filterCat}` };
    }
    return { action: "allow", reason: "Within policy" };
  }

  // Layer 3: full pipeline with an audit entry per decision.
  async enforce(targetURL) {
    const data = await this.classify(targetURL);
    const decision = this.evaluate(data);
    this.auditLog.push({
      url: targetURL,
      ...decision,
      timestamp: new Date().toISOString()
    });
    return decision;
  }
}

Policy Rule Matrix

700+ categories mapped to allow/block/review decisions

AI Agent Database Pricing

Purpose-built domain databases for AI agent filtering. Includes IAB categories, 20+ page types, reputation scores, and popularity rankings. One-time purchase with perpetual license.

AI Agent Database
AI Agent Domain Database 10M
$7,999

10 Million Domains with Page-Type Intelligence

One-time purchase: Perpetual license  |  Optional Updates: $1,599/year

  • 10M+ Categorized Domains
  • IAB Taxonomies v2 & v3
  • 20+ Page Type Labels
  • Web Filtering Categories
  • OpenPageRank Scores
  • Global Popularity Rankings

AI Agent Domain Database 20M (Popular)
$14,999

20 Million Domains with Full Intelligence Suite

One-time purchase: Perpetual license  |  Optional Updates: $2,999/year

  • 20M+ Categorized Domains
  • IAB Taxonomies v2 & v3
  • 20+ Page Type Labels
  • Web Filtering Categories
  • OpenPageRank Scores
  • Global & Country Rankings
  • Dedicated Account Manager

AI Agent Domain Database 50M (Maximum Coverage)
$24,999

50 Million Domains with Complete Intelligence Suite

One-time purchase: Perpetual license  |  Optional Updates: $4,999/year

  • 50M+ Categorized Domains
  • IAB Taxonomies v2 & v3
  • 20+ Page Type Labels
  • Web Filtering Categories
  • OpenPageRank Scores
  • Global & Country Rankings
  • Dedicated Account Manager

Also available: Enterprise URL Database with up to 102M domains, from $2,499.

How Many Domains in Each Category?

Search any IAB or Web Filtering category to see how many domains are in our 102M Enterprise Database — the same data your policy engine evaluates for every navigation decision.

Database Analytics

Domain Distribution by Category in Our 102M Enterprise Database

How 102 million domains from our main Enterprise Database are distributed across IAB v3 taxonomy classifications

Top 50 IAB v3 Categories

Spanning Tier 1 through Tier 4 classifications from our 102M Enterprise Database


Charts display domain counts for the top 50 out of 700+ categories in our 102M Enterprise Database. To check the number of domains for the remaining 650+ categories, use the Category Counter tool above.

Three-Tier Evaluation Pipeline

Classify, evaluate, enforce — the complete policy decision chain

Anatomy of a Production Policy Engine for Agent Browsing

A policy engine is the core decision-making component of any agent governance architecture. It sits between the agent's intent to navigate and the actual HTTP request, intercepting every URL before the agent can reach it. The engine does not decide what the agent should do — the LLM handles that. The engine decides what the agent is allowed to do, based on structured data and deterministic rules. This separation of intelligence (LLM) from governance (policy engine) is the architectural principle that makes production agent deployments viable.

Without this separation, organizations face an impossible choice: either trust the LLM to make security-critical decisions (which it was not designed to do), or restrict the agent so severely that it cannot accomplish its tasks. A policy engine resolves this tension by offloading security decisions to a purpose-built component that operates on structured data rather than probabilistic inference.

The Three-Decision Model: Allow, Block, Review

Binary allow/block decisions are insufficient for real-world agent deployments. Some URLs fall into ambiguous categories that require human judgment. A domain categorized as "News" might host content ranging from financial analysis (clearly within scope for a finance agent) to conspiracy theories (clearly outside scope). A binary system forces you to either allow all News domains or block all of them, neither of which is appropriate.

The three-decision model adds a "review" path: URLs that are not clearly allowed or blocked are queued for human evaluation. The agent's workflow pauses on the specific URL in question while a human analyst reviews the classification data and makes the final call. The review decision is logged, and if the same domain is encountered again, the previous human decision is cached and applied automatically.
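The caching behavior described above can be sketched as a small lookup keyed by domain. This is an illustration under assumptions: `ReviewCache` and its method names are hypothetical, storage is in memory only, and a production system would persist decisions and likely expire them.

```python
from urllib.parse import urlsplit


class ReviewCache:
    """Remembers human review outcomes so repeat visits skip the queue."""

    def __init__(self):
        self._decisions = {}  # domain -> "allow" or "block"
        self.pending = []     # domains waiting for a human decision

    def resolve(self, url: str) -> str:
        domain = urlsplit(url).hostname or ""
        if domain in self._decisions:
            return self._decisions[domain]
        if domain not in self.pending:
            self.pending.append(domain)
        return "pending"

    def record_human_decision(self, domain: str, decision: str) -> None:
        if decision not in ("allow", "block"):
            raise ValueError(f"Unsupported review outcome: {decision}")
        self._decisions[domain] = decision
        if domain in self.pending:
            self.pending.remove(domain)
```

The first visit to a domain returns "pending" and queues it once; after the analyst records a decision, every later URL on that domain resolves immediately.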

Rule Priority and Override Chains

Policy rules are not all equal. A page-type rule ("block all login pages") should override a category rule ("allow Business and Finance"). A domain-level exception ("always allow example.com") should override both. A well-designed policy engine implements a clear priority chain: domain exceptions take precedence over page-type rules, which take precedence over category rules, which take precedence over the default action.

This priority chain ensures that the policy engine can handle edge cases without requiring increasingly complex rule definitions. Instead of writing a single monolithic rule that accounts for every scenario, you write simple rules at different priority levels and let the engine resolve conflicts deterministically.
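The priority chain can be sketched as an ordered walk over three rule tables. The function name `resolve_action`, the level labels, and the "review" default are assumptions made for this example.

```python
from typing import Mapping, Tuple


def resolve_action(
    domain: str,
    page_type: str,
    category: str,
    domain_exceptions: Mapping[str, str],
    page_type_rules: Mapping[str, str],
    category_rules: Mapping[str, str],
    default_action: str = "review",
) -> Tuple[str, str]:
    """Return (action, matched_level); the first matching level wins."""
    chain = (
        ("domain_exception", domain, domain_exceptions),
        ("page_type_rule", page_type, page_type_rules),
        ("category_rule", category, category_rules),
    )
    for level, key, rules in chain:
        if key in rules:
            return rules[key], level
    return default_action, "default"
```

Returning the matched level alongside the action gives the audit log a direct reference to which rule tier produced the decision.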

Contextual Policy Rules Based on Agent State

Advanced policy engines evaluate rules based not just on the target URL but on the agent's current state. An agent performing a financial research task might be allowed to visit financial news sites but not social media. If the same agent is reassigned to a marketing task, its policy profile changes and social media access is permitted. This contextual evaluation requires the policy engine to maintain awareness of the agent's current task assignment and apply the corresponding rule set.

The URL categorization database provides the foundational data that makes contextual rules practical. Without pre-computed categories, every contextual policy evaluation would require a model inference call to determine whether the target URL is "relevant to the agent's current task." With the database, the evaluation is a simple set-membership check: is this domain's category in the current task's approved category list?
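The set-membership check mentioned above is short enough to show in full. The task names and category lists below are sample values chosen to match the finance and marketing example; they are not a recommended policy.

```python
# Hypothetical task profiles: each task maps to its approved categories.
TASK_PROFILES = {
    "financial_research": frozenset({"Business and Finance", "News"}),
    "marketing": frozenset(
        {"Business and Finance", "News", "Social Networking"}
    ),
}


def allowed_for_task(task: str, category: str) -> bool:
    """True when the category is in the task's approved list.

    An unknown task gets an empty profile, so nothing is allowed by default.
    """
    return category in TASK_PROFILES.get(task, frozenset())
```

Reassigning an agent from one task to another changes only the profile key it passes in; the rule data and the check itself stay the same.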

Audit Logging and Compliance Evidence

Every policy decision must produce a structured audit record containing the timestamp, the requesting agent instance, the target URL, the resolved category data, the applied rule, and the decision outcome. These records serve three purposes: operational debugging (why was this URL blocked?), security investigation (what did this agent access during the incident window?), and compliance evidence (prove that your agents operated within policy boundaries).

The audit log format should be machine-readable (JSON or structured log format) and should include enough context to reproduce the decision. A compliance auditor should be able to take any audit record, look up the same URL in the categorization database, apply the same policy rules, and arrive at the same decision. This reproducibility is the hallmark of a deterministic policy engine.
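A minimal sketch of such a record and a reproducibility check follows. The field names and both function names are assumptions. For simplicity the check replays the classification stored in the record instead of querying the database again, which also guards against the data having changed since the decision was made.

```python
import json
from datetime import datetime, timezone


def make_audit_record(agent_id, url, classification, rule, decision, now=None):
    """Serialize one policy decision as a JSON line with stable key order."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return json.dumps(
        {
            "timestamp": timestamp,
            "agent_id": agent_id,
            "url": url,
            "classification": classification,
            "rule": rule,
            "decision": decision,
        },
        sort_keys=True,
    )


def reproduces(record_json, evaluate):
    """Re-run the rule function on the logged data and compare outcomes."""
    record = json.loads(record_json)
    return evaluate(record["classification"]) == record["decision"]
```

An auditor who holds the rule function used at decision time can call `reproduces` on any stored line and expect `True`; a `False` result points to a changed rule set or a tampered record.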

Policy Engine Performance at Scale

The policy engine sits in the critical path of every agent navigation event. If the engine adds 2 seconds of latency to each URL visit, and the agent visits 500 URLs per session, you have added 1,000 seconds (nearly 17 minutes) of overhead to each agent session. This is why the policy engine must operate at sub-millisecond speeds for the vast majority of lookups.

With the 102M database loaded into Redis, each category lookup completes in under 1ms. The policy rule evaluation — comparing the returned category against the rule set — adds negligible overhead. The total end-to-end latency from navigation intent to policy decision is under 2ms for cached domains. For the 0.5% of domains not in the local database, the API fallback adds approximately 200ms — still fast enough to be imperceptible in the agent's workflow.
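The figures in this section can be checked with a little arithmetic, and the local-first lookup can be sketched alongside it. All function names are hypothetical; a plain dictionary stands in for Redis and a callable stands in for the API client.

```python
def session_overhead_seconds(urls_per_session: int, latency_ms: float) -> float:
    """Total added latency for one session, in seconds."""
    return urls_per_session * latency_ms / 1000


def expected_latency_ms(hit_rate: float, local_ms: float,
                        fallback_ms: float) -> float:
    """Average per-lookup latency given the share of local hits."""
    return hit_rate * local_ms + (1 - hit_rate) * fallback_ms


def lookup(domain, local_db, api_fallback):
    """Try the local store first; call the API only on a miss."""
    record = local_db.get(domain)
    if record is not None:
        return record, "local"
    return api_fallback(domain), "api"
```

At 2 seconds per decision, 500 URLs cost 1,000 seconds; at 2ms they cost 1 second. Using the upper-bound figures quoted above (99.5% local hits at 2ms, 0.5% fallbacks at 200ms), the average works out to about 3ms per lookup.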

Integration with Existing Security Infrastructure

The policy engine should not operate in isolation. It should integrate with your existing security infrastructure: SIEM systems for centralized logging, SOAR platforms for automated incident response, and identity management systems for agent authentication. When the policy engine blocks a URL, the block event should appear in your SIEM alongside firewall blocks and proxy denials. When a pattern of suspicious URL access is detected, the SOAR platform should automatically restrict the agent's permissions.

This integration is straightforward because the policy engine produces structured data in standard formats. The audit log entries can be streamed to any SIEM via syslog, HTTP webhooks, or message queues. The policy rules can be managed through your existing configuration management tools — Terraform, Ansible, or your CI/CD pipeline.
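As one possible shape for that stream, the sketch below emits each decision as a JSON line through Python's standard `logging` module. The logger name, the `audit` attribute, and `build_audit_logger` are choices made for this example. The handler is passed in, so a `logging.handlers.SysLogHandler` pointed at your collector could replace the stream handler.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {"level": record.levelname, "event": record.getMessage()}
        # Fields passed via extra={"audit": {...}} land on record.audit.
        payload.update(getattr(record, "audit", {}))
        return json.dumps(payload, sort_keys=True)


def build_audit_logger(handler: logging.Handler) -> logging.Logger:
    logger = logging.getLogger("agent.policy.audit")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep audit lines out of the root logger
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger
```

A block event would then be logged with `logger.info("policy_decision", extra={"audit": {"url": url, "decision": "block"}})`, producing a single machine-readable line per decision.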

From Prototype to Production: The Policy Engine Maturity Model

Most organizations start with a simple blocklist — a flat file of domains the agent cannot visit. This is level one. Level two adds category-based rules using the URL categorization database. Level three introduces page-type awareness and contextual rules based on agent state. Level four integrates with the organizational security infrastructure and implements automated anomaly detection. Level five achieves full policy-as-code, where policy rules are version-controlled, peer-reviewed, and deployed through CI/CD pipelines just like application code.

The 102M domain database supports every level of this maturity model. At level one, it provides the category data that evolves your blocklist into a category-based rule set. At level five, it serves as the data layer in a fully automated policy-as-code pipeline. The database grows with your governance ambitions — you never outgrow it.
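At the policy-as-code level, a CI job would typically validate a rule file before it is deployed. The check below is a minimal sketch: the policy layout (a `default` key plus three rule sections) and the function name `validate_policy` are assumptions, not a format defined by the database vendor.

```python
VALID_ACTIONS = {"allow", "block", "review"}
RULE_SECTIONS = ("domain_exceptions", "page_type_rules", "category_rules")


def validate_policy(policy: dict) -> list:
    """Return a list of human-readable errors; an empty list means valid."""
    errors = []
    if policy.get("default") not in VALID_ACTIONS:
        errors.append("default: must be one of allow, block, review")
    for section in RULE_SECTIONS:
        for key, action in policy.get(section, {}).items():
            if action not in VALID_ACTIONS:
                errors.append(
                    f"{section}[{key!r}]: invalid action {action!r}"
                )
    return errors
```

A pipeline step can fail the build whenever the returned list is non-empty, so a typo such as "blok" never reaches a running agent.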

Policy Engine Control Panel

Real-time monitoring of allow/block/review decisions across the agent fleet

Build Your Agent Policy Engine Today

Deploy URL categorization as the data layer for your agent policy engine. Deterministic decisions, structured audit trails, sub-millisecond lookups.
