WebsiteCategorizationAPI

Finding the Right Domain Taxonomy Provider for AI Agent Policy

AI agents need structured, hierarchical domain classifications to enforce web browsing policies. The taxonomy provider you choose determines whether your agent governance is granular and auditable or coarse and brittle. Here is what to evaluate, why IAB taxonomy has become the industry standard, and how to integrate taxonomy data into your agent policy engine.

  • 700+ IAB Categories
  • 4 Taxonomy Tiers
  • 102M Classified Domains
  • 20+ Page Types

The Problem: Flat Taxonomies Create Flat Policies

Most domain classification providers offer shallow, single-tier category lists that cannot support the nuanced policy rules that enterprise AI agent deployments require.

Why Generic Classification Falls Short for Agent Governance

When teams evaluate domain taxonomy providers for AI agent policy, they quickly discover that most classification vendors were built for advertising, not security. Their taxonomies are designed to place ads next to relevant content, not to enforce granular allow/block decisions on autonomous agents. A taxonomy that groups "Banking" and "Cryptocurrency" under a single "Finance" label cannot distinguish between a corporate treasury research task and a speculative trading site visit. The result is policies that are either too permissive (allowing agents onto risky pages) or too restrictive (blocking legitimate research destinations).

  • Single-tier categories: Flat lists with 30 to 50 categories force you into binary allow/block decisions with no granularity between "Finance" and "Cryptocurrency Trading Platforms"
  • No page-type metadata: Category alone does not tell you whether the agent is about to visit a login page, a checkout flow, or a public documentation site within the same domain
  • Inconsistent labeling: Providers with proprietary taxonomies create vendor lock-in — switching providers means rewriting every policy rule from scratch
  • Sparse coverage: Many providers classify only the top 1 to 5 million domains, leaving 95%+ of the internet as "Unknown" in your policy engine

The Solution: IAB Content Taxonomy as the Standard for Agent Policy

The IAB (Interactive Advertising Bureau) Content Taxonomy v3 is a hierarchical, open-standard classification system with 700+ categories organized across four tiers. Unlike proprietary classification schemes, IAB taxonomy is maintained by an industry consortium, documented publicly, and adopted across thousands of platforms. When you build agent policy rules on IAB taxonomy, you are building on a stable, interoperable foundation that will not break when you switch vendors or upgrade your agent infrastructure.

Our database applies IAB taxonomy to 102 million domains and enriches each entry with web filtering categories, 20+ page-type labels, reputation scores, and popularity rankings. This means your agent policy engine can operate at any level of granularity — from broad Tier 1 rules ("block all Adult content") to surgical Tier 4 rules ("allow Financial Services > Banking > Commercial Banking but block Financial Services > Investing > Cryptocurrency") — all using a standardized, portable vocabulary.
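
One way to picture rules at different tiers is a longest-prefix match over the taxonomy path, where the deepest matching rule wins. The sketch below is illustrative only: the rule table, the `resolve_action` helper, and the full paths (prefixed with "Business and Finance", as elsewhere on this page) are assumptions, not part of the database or API.

```python
from typing import Dict, Tuple

SEP = " > "

# Hypothetical rule table keyed by full taxonomy path; deeper rules win.
RULES: Dict[str, str] = {
    "Adult Content": "block",
    "Business and Finance > Financial Services > Banking > Commercial Banking": "allow",
    "Business and Finance > Financial Services > Investing > Cryptocurrency": "block",
}


def resolve_action(category_path: str, rules: Dict[str, str],
                   default: str = "review") -> Tuple[str, int]:
    """Return (action, matched_tier) for the deepest rule that prefixes the path."""
    tiers = category_path.split(SEP)
    # Try the most specific prefix first, then walk up toward Tier 1.
    for depth in range(len(tiers), 0, -1):
        prefix = SEP.join(tiers[:depth])
        if prefix in rules:
            return rules[prefix], depth
    # No rule matched at any tier; matched_tier 0 signals "default applied".
    return default, 0


if __name__ == "__main__":
    path = "Business and Finance > Financial Services > Investing > Cryptocurrency"
    print(resolve_action(path, RULES))
```

Matching on whole path prefixes rather than substrings keeps a Tier 1 rule from accidentally firing on a similarly named Tier 3 category.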

Taxonomy Hierarchy Visualization

IAB v3 categories branching across four tiers of classification depth

What to Evaluate in a Domain Taxonomy Provider

Six critical dimensions that separate taxonomy providers built for agent governance from those built for advertising

Taxonomy Depth and Granularity

A four-tier taxonomy with 700+ categories lets you write policies that distinguish between "Technology > Computing > Cloud Computing > Infrastructure as a Service" and a generic "Technology" label. Depth enables precision. Flat taxonomies force you to over-block or under-filter. Ask every provider: how many tiers does your taxonomy have, and how many leaf-node categories exist at the deepest level?

Domain Coverage Volume

An agent filtering database is only useful if it covers the domains your agents actually visit. Providers with 1 to 10 million domains leave massive gaps. Our 102 million domain database covers 99.5% of the active internet as measured by the Google Chrome User Experience Report. Every "Unknown" result in your policy engine is a decision you have to make without data, and in security that default is usually "block," which kills agent productivity.

Page-Type Metadata

Domain-level categories answer "what is this site about?" but page-type metadata answers "what is this specific page designed to do?" A domain categorized as "Business and Finance" could serve a public investor relations page, a login portal, an admin dashboard, or a checkout flow. Page-type labels — login, checkout, settings, admin, careers, pricing, documentation — give your policy engine the granularity to block dangerous page types while allowing benign ones on the same domain.

Open vs. Proprietary Taxonomy

Proprietary taxonomies create vendor lock-in. If your taxonomy provider uses custom category names like "BIZ_FIN_003" instead of the IAB standard "Business and Finance > Financial Services > Banking," then every policy rule, every audit report, and every compliance mapping is tied to that specific vendor. IAB taxonomy is an open standard — you can switch providers, merge datasets, or build custom overlays without rewriting your policy logic.

Update Frequency and Freshness

The internet changes constantly. Approximately 50,000 new domains are registered every day, and existing domains shift content regularly. A taxonomy provider that updates less often than quarterly will have stale classifications for millions of domains. Look for providers that offer quarterly database refreshes at minimum, with a real-time API fallback for domains not yet in the offline database. Our database offers quarterly updates with optional annual refresh subscriptions.

Delivery Format and Integration Ease

How the taxonomy data reaches your agent stack matters. Providers that only offer API-based access introduce latency and external dependencies into your agent's decision path. A bulk database download — CSV, JSON, or SQL dump — lets you load the data into your own infrastructure and query it locally in sub-millisecond time. Our database ships as a downloadable file you can ingest into Redis, PostgreSQL, SQLite, DynamoDB, or any key-value store.
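
As a rough illustration of local querying, the following sketch loads a CSV export into SQLite and looks up a domain without any network call. The column names (`domain`, `iab_v3_path`, `web_filter_category`) and the sample rows are assumptions made for the example; the actual file layout may differ.

```python
import csv
import io
import sqlite3
from typing import Optional, Tuple

# Hypothetical CSV layout and rows; the real export's columns may differ.
SAMPLE_CSV = """domain,iab_v3_path,web_filter_category
example-bank.com,Business and Finance > Financial Services > Banking,Finance
example-docs.io,Technology & Computing > Computing,Technology
"""


def load_database(csv_text: str) -> sqlite3.Connection:
    """Load CSV rows into an in-memory SQLite table keyed by domain."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE domains ("
        "domain TEXT PRIMARY KEY, iab_v3_path TEXT, web_filter_category TEXT)"
    )
    rows = csv.DictReader(io.StringIO(csv_text))
    conn.executemany(
        "INSERT INTO domains VALUES (:domain, :iab_v3_path, :web_filter_category)",
        rows,
    )
    conn.commit()
    return conn


def lookup(conn: sqlite3.Connection, domain: str) -> Optional[Tuple[str, str]]:
    """Return (iab_v3_path, web_filter_category), or None if the domain is absent."""
    return conn.execute(
        "SELECT iab_v3_path, web_filter_category FROM domains WHERE domain = ?",
        (domain.lower(),),
    ).fetchone()


if __name__ == "__main__":
    db = load_database(SAMPLE_CSV)
    print(lookup(db, "example-bank.com"))
    print(lookup(db, "not-in-database.example"))
```

The primary key on `domain` gives SQLite an index for the lookup; the same pattern carries over to a file-backed database or another key-value store.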

Provider Evaluation Dimensions

Comparing taxonomy depth, coverage, page types, freshness, and delivery

Integrating Taxonomy Data into Agent Policy Rules

Production-ready code showing how IAB taxonomy tiers map to granular agent governance decisions

Python — Tiered Taxonomy Policy Engine

import http.client
import json
import urllib.parse


class TaxonomyPolicyEngine:
    """Maps IAB taxonomy tiers to agent policy actions."""

    # Tier 1 hard blocks: entire verticals off-limits
    TIER1_BLOCKED = [
        "Adult Content", "Illegal Content",
        "Sensitive Topics", "Arms & Ammunition"
    ]

    # Tier 2 conditional rules: granular allow/block,
    # matched as substrings of the returned category path
    TIER2_RULES = {
        "Financial Services > Cryptocurrency": "block",
        "Financial Services > Banking": "allow",
        "Technology & Computing > Hacking": "block",
        "Health & Fitness > Pharmaceuticals": "review",
    }

    # Page types that override category decisions
    BLOCKED_PAGE_TYPES = ["login", "checkout", "admin", "settings"]

    def __init__(self, api_key):
        self.api_key = api_key
        self.conn = http.client.HTTPSConnection(
            "www.websitecategorizationapi.com"
        )

    def classify(self, url):
        # URL-encode the form body so "&" or "=" inside the target URL
        # cannot corrupt the request parameters
        payload = urllib.parse.urlencode({
            "query": url,
            "api_key": self.api_key,
            "data_type": "url",
            "expanded_categories": 1,
        })
        headers = {"Content-Type": "application/x-www-form-urlencoded"}
        self.conn.request(
            "POST",
            "/api/iab/iab_web_content_filtering.php",
            payload, headers
        )
        res = self.conn.getresponse()
        return json.loads(res.read().decode("utf-8"))

    def evaluate_policy(self, url):
        data = self.classify(url)
        page_type = data.get("page_type", "unknown")

        # Page-type override: block dangerous pages
        if page_type in self.BLOCKED_PAGE_TYPES:
            return {
                "action": "block",
                "reason": f"Page type '{page_type}' is restricted",
                "url": url
            }

        # Extract full taxonomy path
        categories = data.get("iab_classification", [])
        for cat_entry in categories:
            cat_path = cat_entry[0].replace("Category name: ", "")

            # Check Tier 1 blocks
            tier1 = cat_path.split(" > ")[0]
            if tier1 in self.TIER1_BLOCKED:
                return {
                    "action": "block",
                    "reason": f"Tier 1 category blocked: {tier1}",
                    "url": url
                }

            # Check Tier 2 rules
            for rule_path, action in self.TIER2_RULES.items():
                if rule_path in cat_path:
                    return {
                        "action": action,
                        "reason": f"Tier 2 rule: {rule_path}",
                        "url": url
                    }

        return {"action": "allow", "reason": "No policy match", "url": url}


# Usage
engine = TaxonomyPolicyEngine(api_key="your_api_key")
result = engine.evaluate_policy("https://example.com/trading")
print(f"Decision: {result['action']} - {result['reason']}")

JavaScript — Taxonomy-Aware Navigation Guard

class TaxonomyGuard {
  constructor(apiKey) {
    this.apiKey = apiKey;
    this.tier1Blocked = new Set([
      "Adult Content", "Illegal Content", "Malware"
    ]);
    this.tier2Rules = new Map([
      ["Cryptocurrency", "block"],
      ["Banking", "allow"],
      ["Hacking", "block"],
      ["Pharmaceuticals", "review"]
    ]);
  }

  async evaluate(targetURL) {
    const resp = await fetch(
      "https://www.websitecategorizationapi.com" +
      "/api/iab/iab_web_content_filtering.php",
      {
        method: "POST",
        headers: {
          "Content-Type": "application/x-www-form-urlencoded"
        },
        body: new URLSearchParams({
          query: targetURL,
          api_key: this.apiKey,
          data_type: "url",
          expanded_categories: "1"
        })
      }
    );
    const data = await resp.json();

    // Walk each taxonomy tier
    const categories = data.iab_classification || [];
    for (const entry of categories) {
      const path = entry[0].replace("Category name: ", "");
      const tiers = path.split(" > ");

      // Tier 1 hard block
      if (this.tier1Blocked.has(tiers[0])) {
        return { action: "block", tier: 1, match: tiers[0] };
      }

      // Tier 2+ granular rules
      for (const [keyword, action] of this.tier2Rules) {
        if (tiers.some(t => t.includes(keyword))) {
          return { action, tier: 2, match: keyword };
        }
      }
    }
    return { action: "allow", tier: null, match: null };
  }
}

// Usage in agent middleware
const guard = new TaxonomyGuard("your_api_key");
const decision = await guard.evaluate("https://example.com");
if (decision.action === "block") {
  console.log(`Blocked at Tier ${decision.tier}: ${decision.match}`);
}

Taxonomy Classification Pipeline

Domains flowing through four-tier IAB classification hierarchy

AI Agent Database Pricing

Purpose-built domain databases for AI agent filtering. Includes IAB categories, 20+ page types, reputation scores, and popularity rankings. One-time purchase with perpetual license.

AI Agent Database
AI Agent Domain Database 10M
$7,999

10 Million Domains with Page-Type Intelligence

One-time purchase: Perpetual license  |  Optional Updates: $1,599/year

  • 10M+ Categorized Domains
  • IAB Taxonomies v2 & v3
  • 20+ Page Type Labels
  • Web Filtering Categories
  • OpenPageRank Scores
  • Global Popularity Rankings
AI Agent Domain Database 20M (Popular)
$14,999

20 Million Domains with Full Intelligence Suite

One-time purchase: Perpetual license  |  Optional Updates: $2,999/year

  • 20M+ Categorized Domains
  • IAB Taxonomies v2 & v3
  • 20+ Page Type Labels
  • Web Filtering Categories
  • OpenPageRank Scores
  • Global & Country Rankings
  • Dedicated Account Manager
AI Agent Domain Database 50M (Maximum Coverage)
$24,999

50 Million Domains with Complete Intelligence Suite

One-time purchase: Perpetual license  |  Optional Updates: $4,999/year

  • 50M+ Categorized Domains
  • IAB Taxonomies v2 & v3
  • 20+ Page Type Labels
  • Web Filtering Categories
  • OpenPageRank Scores
  • Global & Country Rankings
  • Dedicated Account Manager

Also available: Enterprise URL Database up to 102M domains from $2,499.

How Many Domains in Each Category?

Search any IAB or Web Filtering category to see how many domains are in our 102M Enterprise Database — the same taxonomy data your agent policy rules will reference.

Database Analytics

Domain Distribution by Category in Our 102M Enterprise Database

How 102 million domains from our main Enterprise Database are distributed across IAB v3 taxonomy classifications

Top 50 IAB v3 Categories

Spanning Tier 1 through Tier 4 classifications from our 102M Enterprise Database


Charts display domain counts for the top 50 out of 700+ categories in our 102M Enterprise Database. To check the number of domains for the remaining 650+ categories, use the Category Counter tool above.

Four-Tier Taxonomy Cascade

Categories branching from broad verticals to granular sub-topics

Why IAB Taxonomy Is the Standard for AI Agent Policy

When enterprise teams started deploying web-browsing AI agents in 2024 and 2025, they faced a fundamental vocabulary problem. Security policies need to reference categories — "block all Adult content," "allow Business and Finance," "flag Gambling for review" — but there was no universal agreement on what those category names should be, how they should be organized, or how deep the hierarchy should go. Teams that built policies on proprietary vendor taxonomies found themselves locked into a single classification provider, unable to switch without rewriting hundreds of policy rules.

The IAB Content Taxonomy solved this problem the same way it solved the equivalent problem in programmatic advertising a decade earlier: by establishing an open, hierarchical, industry-maintained standard that any vendor can implement and any buyer can adopt. Version 3 of the IAB taxonomy defines over 700 categories organized across four tiers of specificity. Tier 1 provides 28 broad verticals (Technology, Finance, Health, Education, etc.). Tier 2 breaks those into 150+ sub-verticals. Tiers 3 and 4 add increasingly specific sub-categories, enabling policy rules that can distinguish between "Business and Finance > Financial Services > Banking > Commercial Banking" and "Business and Finance > Financial Services > Investing > Cryptocurrency."

How Taxonomy Depth Translates to Policy Precision

Consider a financial services company deploying an AI agent to research competitor products. A single-tier taxonomy would classify both a competitor's public marketing site and a cryptocurrency exchange under "Finance." The policy engine has no way to allow one and block the other. With IAB v3's four-tier hierarchy, the policy engine can write rules at the exact level of granularity needed: allow "Financial Services > Banking" at Tier 3, block "Financial Services > Investing > Cryptocurrency" at Tier 4, and flag "Financial Services > Insurance" for human review at Tier 3.

This depth also enables role-based access control for agents. A compliance research agent might have access to "Legal Services" Tier 2 categories, while a marketing agent is restricted to "Advertising and Marketing" sub-categories. The taxonomy hierarchy makes these permission boundaries explicit and auditable, rather than encoded in opaque prompt instructions that can be jailbroken.
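
A role-to-taxonomy mapping of this kind could be sketched as follows. The role names, the `is_permitted` helper, and the parent paths shown for "Legal Services" and "Advertising and Marketing" are hypothetical placeholders; check the published taxonomy for the exact paths before writing real rules.

```python
from typing import Dict, Tuple

SEP = " > "

# Hypothetical roles and illustrative taxonomy prefixes (not verified paths).
ROLE_ALLOWED_PREFIXES: Dict[str, Tuple[str, ...]] = {
    "compliance_research": ("Business and Finance > Legal Services",),
    "marketing": ("Business and Finance > Advertising and Marketing",),
}


def is_permitted(role: str, category_path: str) -> bool:
    """True if the path equals, or sits beneath, one of the role's allowed prefixes."""
    for prefix in ROLE_ALLOWED_PREFIXES.get(role, ()):
        if category_path == prefix or category_path.startswith(prefix + SEP):
            return True
    # Unknown roles and unmatched categories are denied by default.
    return False


if __name__ == "__main__":
    print(is_permitted("marketing", "Business and Finance > Advertising and Marketing > SEO"))
    print(is_permitted("marketing", "Business and Finance > Legal Services"))
```

Because the permission table is plain data, it can be versioned and reviewed like any other access-control configuration.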

The Coverage Problem: Why Volume of Classified Domains Matters

A taxonomy is a vocabulary. A database is the dictionary that applies that vocabulary to every domain on the internet. You can have the most sophisticated taxonomy in the world, but if your provider classifies only 5 million domains out of a universe of roughly 100 million, about 95% of domains will come back as "Unknown." Each "Unknown" result forces your policy engine into a default decision: block (which kills agent productivity) or allow (which defeats the purpose of filtering). Neither default is acceptable for production agent deployments.

Our database classifies 102 million domains using the IAB taxonomy, covering 99.5% of the active internet. This means that for virtually every URL your agent will encounter during normal operation, the taxonomy lookup returns a concrete classification that your policy engine can act on. The remaining 0.5% — newly registered domains, parked pages, and extremely niche sites — are handled by a real-time API fallback that classifies any URL on demand using the same IAB taxonomy, ensuring consistent policy enforcement regardless of the data source.
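
The local-first, API-second flow described here might look like the sketch below. The `classify_with_fallback` function, the in-memory dictionary standing in for the offline database, and the stub API function are all assumptions for illustration; a real deployment would plug in its own lookup and an HTTP client.

```python
from typing import Callable, Dict, Optional, Tuple

# A classifier takes a domain and returns a category path, or None if unknown.
Classifier = Callable[[str], Optional[str]]


def classify_with_fallback(domain: str,
                           local_db: Dict[str, str],
                           api_classify: Classifier) -> Tuple[str, str]:
    """Return (category, source), where source is 'local', 'api', or 'none'."""
    category = local_db.get(domain)
    if category is not None:
        return category, "local"
    try:
        category = api_classify(domain)
    except Exception:
        # Treat any API failure as "no classification" so policy can apply a default.
        category = None
    if category:
        local_db[domain] = category  # cache so the next lookup stays local
        return category, "api"
    return "Unknown", "none"


def stub_api(domain: str) -> Optional[str]:
    """Stand-in for a real-time classification call (hypothetical data)."""
    return {"new-site.example": "Technology & Computing"}.get(domain)


if __name__ == "__main__":
    db = {"example-bank.com": "Business and Finance > Financial Services > Banking"}
    print(classify_with_fallback("example-bank.com", db, stub_api))
    print(classify_with_fallback("new-site.example", db, stub_api))
    print(classify_with_fallback("nowhere.example", db, stub_api))
```

Returning the source alongside the category lets audit logs record whether a decision came from the offline database or the live fallback.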

Page-Type Labels: The Missing Layer in Most Taxonomy Providers

Domain-level IAB categories tell you what a website is about. Page-type labels tell you what a specific page is designed to do. This distinction is critical for agent governance because the same domain can serve pages with radically different risk profiles. A domain classified as "Technology > Computing > Cloud Computing" might serve a public documentation page (safe for any agent), a login portal (dangerous — the agent could attempt authentication), a pricing page (potentially sensitive competitive intelligence), or an admin console (critical — the agent could modify settings).

Our database includes 20+ page-type labels — homepage, about, contact, pricing, careers, login, signup, checkout, settings, admin, legal, privacy, terms, blog, documentation, API, support, FAQ, forum, and product — that let your policy engine make page-level decisions, not just domain-level decisions. A policy rule like "allow Technology domains except login and admin pages" requires both taxonomy categories and page-type metadata. Most taxonomy providers offer only the first half of that equation.
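
A rule such as "allow Technology domains except login and admin pages" needs both inputs at decision time. The following is a minimal sketch under assumed names (`PagePolicy`, `decide`) and an assumed Tier 1 label of "Technology"; it is not the database's own rule format.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class PagePolicy:
    """Combines a Tier 1 allowlist with a page-type blocklist."""
    allowed_tier1: FrozenSet[str]
    blocked_page_types: FrozenSet[str]

    def decide(self, category_path: str, page_type: str) -> str:
        tier1 = category_path.split(" > ")[0]
        # The category must be allowed before page type is even considered.
        if tier1 not in self.allowed_tier1:
            return "block"
        # Same domain, different page: sensitive page types are still blocked.
        if page_type in self.blocked_page_types:
            return "block"
        return "allow"


# Hypothetical policy: Technology sites are allowed, except login and admin pages.
POLICY = PagePolicy(
    allowed_tier1=frozenset({"Technology"}),
    blocked_page_types=frozenset({"login", "admin"}),
)

if __name__ == "__main__":
    path = "Technology > Computing > Cloud Computing"
    print(POLICY.decide(path, "documentation"))
    print(POLICY.decide(path, "login"))
```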

Web Filtering Categories: The Security Overlay

IAB taxonomy was designed for content classification. Web filtering categories were designed for security. Our database includes both, giving your agent policy engine two complementary lenses on every domain. Web filtering categories include Malware, Phishing, Spam, Adult, Gambling, Weapons, Drugs, Hacking, and Proxy/VPN — the same categories that enterprise web proxies and CASBs use to protect human users. Extending these same categories to AI agents ensures a consistent security posture across your entire organization, whether the web session is initiated by a person or an autonomous agent.
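
One possible way to combine the two lenses is to check the security category first and fall through to the IAB path only when the security check passes. The function name, the specific blocklists, and the result shape below are assumptions for illustration.

```python
from typing import Dict, Optional

# Security categories named above that this hypothetical policy always blocks.
SECURITY_BLOCKLIST = frozenset({"Malware", "Phishing", "Spam", "Hacking", "Proxy/VPN"})

# Hypothetical content-level block on a Tier 1 IAB category.
IAB_BLOCKED_TIER1 = frozenset({"Adult Content"})


def evaluate(iab_path: str, web_filter_category: Optional[str]) -> Dict[str, Optional[str]]:
    """Apply the security lens first, then the content lens."""
    if web_filter_category in SECURITY_BLOCKLIST:
        return {"action": "block", "lens": "web_filter", "match": web_filter_category}
    tier1 = iab_path.split(" > ")[0]
    if tier1 in IAB_BLOCKED_TIER1:
        return {"action": "block", "lens": "iab", "match": tier1}
    return {"action": "allow", "lens": None, "match": None}


if __name__ == "__main__":
    # A site can look benign by content category yet be flagged by the security overlay.
    print(evaluate("Technology & Computing > Computing", "Phishing"))
    print(evaluate("Technology & Computing > Computing", None))
```

Recording which lens triggered the block makes it easier to tell security incidents apart from content-policy hits in audit reports.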

Building Policy Rules on a Stable Taxonomy Foundation

One of the most underappreciated benefits of using IAB taxonomy for agent policy is stability. Proprietary taxonomies change at the vendor's discretion — categories get renamed, merged, or deleted, breaking every policy rule that referenced them. IAB taxonomy versions are maintained by a consortium with a formal change process. When IAB v3 replaced v2, the mapping between old and new categories was published and documented, allowing teams to migrate policy rules systematically rather than scrambling to identify what broke.

Our database includes both IAB v2 and v3 classifications for every domain, allowing teams to operate on whichever version their existing policy infrastructure uses and to plan their migration at their own pace. This dual-version approach eliminates the "rip and replace" risk that comes with single-taxonomy providers.

Reputation and Popularity Signals: Context Beyond Categories

Categories and page types answer the "what" questions. Reputation and popularity signals answer the "how trustworthy" and "how well-known" questions. Our database enriches each domain with OpenPageRank scores (domain authority on a 0-10 scale) and global popularity rankings derived from the Google Chrome User Experience Report. These signals let your policy engine add nuance to category-based decisions. A domain classified as "Business and Finance" with a PageRank of 8 and a top-10,000 global ranking is likely a major financial institution. A domain with the same category but a PageRank of 1 and no ranking data is far more likely to be a scam site, a newly registered phishing domain, or a low-quality content farm.
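
Reputation signals can be layered on top of a category decision, for example by downgrading an "allow" to "review" when a domain has weak or missing signals. The thresholds and the `adjust_for_reputation` helper below are illustrative assumptions, not recommended values.

```python
from typing import Optional


def adjust_for_reputation(category_action: str,
                          page_rank: Optional[float],
                          global_rank: Optional[int],
                          min_page_rank: float = 3.0,
                          max_trusted_rank: int = 1_000_000) -> str:
    """Downgrade 'allow' to 'review' when reputation signals are weak or missing."""
    # Never loosen a block or review decision based on reputation alone.
    if category_action != "allow":
        return category_action
    if page_rank is None or page_rank < min_page_rank:
        return "review"
    if global_rank is None or global_rank > max_trusted_rank:
        return "review"
    return "allow"


if __name__ == "__main__":
    # Same category, very different trust profiles.
    print(adjust_for_reputation("allow", page_rank=8.0, global_rank=9_500))
    print(adjust_for_reputation("allow", page_rank=1.0, global_rank=None))
```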

Common Mistakes When Selecting a Taxonomy Provider

The first mistake is evaluating providers on taxonomy size alone. A provider with 2,000 categories sounds impressive until you realize that most of those categories have fewer than 100 classified domains. Taxonomy breadth without domain coverage is an empty vocabulary. The second mistake is choosing a provider optimized for advertising use cases. Advertising taxonomies are designed to maximize ad relevance, not to enforce security policies. They often lack the security-focused categories (Malware, Phishing, Hacking) that agent policy engines need. The third mistake is ignoring delivery format. An API-only provider introduces latency and external dependencies into your agent's decision loop. For production agent deployments, you need a local database that your policy engine can query without leaving your network.

How to Migrate from a Proprietary Taxonomy to IAB

If your organization is already running agent policies on a proprietary taxonomy, migrating to IAB is a three-step process. First, create a mapping table between your existing categories and IAB v3 categories. Most proprietary taxonomies use 30 to 50 categories, making the mapping exercise manageable. Second, run both taxonomies in parallel for 30 days, comparing policy decisions to identify mismatches. Third, cut over to IAB-based rules once the parallel run confirms equivalence. Our team provides migration support, including pre-built mapping tables for the most common proprietary taxonomies used by web filtering vendors.
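
The first two migration steps can be sketched as a mapping table plus a parallel-run comparison. Everything named here is hypothetical: the legacy codes (including "BIZ_FIN_003", the placeholder used earlier on this page), the mapping, and the sample domains.

```python
from typing import Dict, Iterable, List, Tuple

# Step 1: hypothetical mapping from legacy category codes to IAB v3 paths.
LEGACY_TO_IAB: Dict[str, str] = {
    "BIZ_FIN_003": "Business and Finance > Financial Services > Banking",
    "CRYPTO_01": "Business and Finance > Financial Services > Investing > Cryptocurrency",
}

LEGACY_RULES: Dict[str, str] = {"BIZ_FIN_003": "allow", "CRYPTO_01": "block"}


def translate_rules(legacy_rules: Dict[str, str],
                    mapping: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Rewrite legacy rules onto IAB paths; also return legacy codes with no mapping."""
    iab_rules: Dict[str, str] = {}
    unmapped: List[str] = []
    for legacy_cat, action in legacy_rules.items():
        iab_path = mapping.get(legacy_cat)
        if iab_path is None:
            unmapped.append(legacy_cat)
        else:
            iab_rules[iab_path] = action
    return iab_rules, unmapped


def find_mismatches(samples: Iterable[Tuple[str, str, str]],
                    legacy_rules: Dict[str, str],
                    iab_rules: Dict[str, str],
                    default: str = "review") -> List[Tuple[str, str, str]]:
    """Step 2: compare old and new decisions for (domain, legacy_cat, iab_path) samples."""
    mismatches = []
    for domain, legacy_cat, iab_path in samples:
        old = legacy_rules.get(legacy_cat, default)
        new = iab_rules.get(iab_path, default)
        if old != new:
            mismatches.append((domain, old, new))
    return mismatches


if __name__ == "__main__":
    iab_rules, unmapped = translate_rules(LEGACY_RULES, LEGACY_TO_IAB)
    samples = [
        ("bank.example", "BIZ_FIN_003", LEGACY_TO_IAB["BIZ_FIN_003"]),
        # The legacy vendor lumped this exchange under its banking code.
        ("coin.example", "BIZ_FIN_003", LEGACY_TO_IAB["CRYPTO_01"]),
    ]
    print(find_mismatches(samples, LEGACY_RULES, iab_rules))
```

An empty mismatch list over the parallel-run window is the signal for the third step, the cutover; non-empty results point at either a mapping gap or a domain the two taxonomies classify differently.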

Policy Decision Network

Taxonomy-driven decisions across agent policy rules

Build Agent Policy on an Industry-Standard Taxonomy

Stop building agent policies on proprietary category lists. Deploy IAB taxonomy with 102 million pre-classified domains, 20+ page types, and web filtering categories — all in a single downloadable database.
