WebsiteCategorizationAPI

Web Filtering for ChatGPT, Atlas, Claude, and Other Browsing Agents

Every major AI platform now ships agents that browse the web autonomously: OpenAI's ChatGPT with browsing, Google's Atlas and Mariner, Anthropic's Claude Computer Use, and dozens of open-source alternatives. Each platform has a different architecture, different tool-calling conventions, and different hook points for filtering. Our 102M domain database provides a universal categorization layer that works across all of them.

102M Classified Domains  |  700+ IAB Categories  |  All Major Platforms  |  20+ Page Types

The Problem: Every Agent Platform Has Different Filtering Hooks

Building custom web filtering for each AI agent platform is unsustainable. You need a single categorization layer that works everywhere.

Platform Fragmentation Creates Governance Gaps

The enterprise AI landscape is multi-vendor by necessity. Your engineering team might use Claude Computer Use for code review automation. Your marketing team uses ChatGPT with browsing for competitive research. Your operations team deploys custom agents built on open-source frameworks like LangChain or CrewAI. Each of these platforms implements web browsing differently — different tool-calling APIs, different browser automation approaches, different levels of control over navigation events. Building bespoke filtering logic for each platform means maintaining multiple codebases, each with its own bugs, gaps, and maintenance overhead.

  • ChatGPT browsing: OpenAI's browsing tool operates behind their API — you have limited control over what URLs the model visits during a browsing session unless you wrap the call with external filtering
  • Claude Computer Use: Anthropic's agent controls the full desktop, including browser navigation — filtering must intercept at the OS or proxy level to cover all possible navigation paths
  • Google Atlas/Mariner: Google's agent platform integrates with Chrome — filtering options include Chrome extensions, proxy configuration, or API-level interception
  • Custom agents: Playwright, Puppeteer, or Selenium-based agents need middleware-level filtering in the code that drives the browser

The Solution: A Universal Categorization Layer That Works Across All Platforms

Instead of building filtering logic for each platform separately, deploy a single URL categorization service that any agent — regardless of platform — can call before navigation. The service takes a URL as input and returns the IAB category, page type, web filtering classification, and reputation score. Your per-platform integration code is minimal: just the hook to intercept the navigation intent and call the categorization service. The heavy lifting — classifying 102 million domains across 700+ categories and 20+ page types — is handled by the shared service.

This architecture decouples the "what to filter" decision (category rules) from the "how to filter" implementation (per-platform hooks). When you add a new blocked category, every agent across every platform immediately enforces it. When you onboard a new agent platform, you write a thin integration layer that calls the same service — no need to rebuild the categorization engine.
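The split between policy and hooks can be illustrated with a minimal Python sketch. The `Policy` class, the `decide` function, and the shape of the classification dictionary are hypothetical names chosen for illustration; they are not part of the API described on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy object: the 'what to filter' half of the design."""
    blocked_categories: frozenset = frozenset({"Adult", "Malware", "Phishing", "Gambling"})
    blocked_page_types: frozenset = frozenset({"login", "checkout", "admin"})


def decide(classification: dict, policy: Policy) -> dict:
    """Pure allow/block decision; it knows nothing about any agent platform."""
    for category in classification.get("categories", []):
        if category in policy.blocked_categories:
            return {"action": "block", "reason": f"Category: {category}"}
    page_type = classification.get("page_type", "unknown")
    if page_type in policy.blocked_page_types:
        return {"action": "block", "reason": f"Page type: {page_type}"}
    return {"action": "allow", "reason": None}


# Adding a blocked category is a policy-only change; no hook code is touched.
base = Policy()
stricter = Policy(blocked_categories=base.blocked_categories | {"Weapons"})

sample = {"categories": ["Weapons"], "page_type": "article"}
print(decide(sample, base)["action"])
print(decide(sample, stricter)["action"])
```

Each platform hook would only need to obtain a classification and call `decide`, so swapping `base` for `stricter` changes behavior everywhere at once.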

Multi-Platform Agent Filtering

One categorization layer serving ChatGPT, Claude, Atlas, and custom agents

Platform-Specific Integration Patterns

How to connect the categorization layer to each major AI agent platform

ChatGPT / OpenAI Agents

For OpenAI's function-calling agents, wrap the browsing tool with a pre-navigation filter. When the agent calls the browse function, your middleware extracts the target URL, queries the categorization API, and either passes the request through or returns a "blocked" tool response. For Operator-style agents, deploy a proxy that intercepts all outbound HTTP traffic from the agent container and performs classification at the network level.

Claude Computer Use

Anthropic's Computer Use agent controls the full desktop environment, which means it can navigate via any browser on the system. The most reliable filtering approach is a transparent proxy or DNS-level filter that intercepts all outbound web traffic from the agent's VM or container. Load the 102M database into the proxy for sub-millisecond classification. Every request the agent makes, regardless of which browser or method it uses, passes through the filter. Note that for HTTPS traffic a proxy or DNS filter sees only the hostname unless the proxy terminates TLS, so this layer enforces domain-level rules by default.

Google Atlas / Mariner

Google's agent platform integrates tightly with Chrome. Filtering options include a Chrome extension that intercepts navigation events and checks them against the categorization database before allowing page load, a proxy configuration in the agent's Chrome profile, or an API-level tool wrapper similar to the OpenAI pattern. The Chrome extension approach provides the richest integration — it can access the full URL including path and query parameters for precise page-type matching.

Centralized Classification Hub

All agent platforms query the same 102M domain database

Cross-Platform Filtering Code

Universal categorization service with platform-specific thin wrappers

Python — Universal Agent Filter Service

```python
import http.client
import json
from urllib.parse import urlencode


class UniversalAgentFilter:
    """Single filtering service that works with ChatGPT, Claude,
    Atlas, and any custom browsing agent."""

    POLICY_RULES = {
        "blocked_categories": [
            "Adult", "Malware", "Phishing", "Gambling",
            "Illegal Content", "Weapons", "Drugs"
        ],
        "blocked_page_types": [
            "login", "checkout", "admin", "settings"
        ],
        "min_reputation": 2
    }

    API_HOST = "www.websitecategorizationapi.com"
    API_PATH = "/api/iab/iab_web_content_filtering.php"

    def __init__(self, api_key):
        self.api_key = api_key

    def classify_and_filter(self, url, agent_platform="unknown"):
        """Platform-agnostic classification and filtering."""
        # urlencode escapes "&", "=", and "?" inside the target URL so they
        # cannot corrupt the form body.
        payload = urlencode({
            "query": url,
            "api_key": self.api_key,
            "data_type": "url",
            "expanded_categories": "1",
        })
        headers = {"Content-Type": "application/x-www-form-urlencoded"}

        # Open a connection per call; a long-lived HTTPSConnection can be
        # closed by the server between requests.
        conn = http.client.HTTPSConnection(self.API_HOST, timeout=10)
        try:
            conn.request("POST", self.API_PATH, payload, headers)
            res = conn.getresponse()
            data = json.loads(res.read().decode("utf-8"))
        finally:
            conn.close()

        # Strip the "Category name: " prefix from each classification entry.
        categories = [
            c[0].split("Category name: ")[-1]
            for c in data.get("iab_classification", [])
        ]
        page_type = data.get("page_type", "unknown")
        reputation = data.get("open_page_rank", 0)

        result = {
            "url": url,
            "agent_platform": agent_platform,
            "categories": categories,
            "page_type": page_type,
            "reputation": reputation,
            "action": "allow"
        }

        # Check against universal policy
        for cat in categories:
            for blocked in self.POLICY_RULES["blocked_categories"]:
                if blocked.lower() in cat.lower():
                    result["action"] = "block"
                    result["reason"] = f"Category: {cat}"
                    return result

        if page_type in self.POLICY_RULES["blocked_page_types"]:
            result["action"] = "block"
            result["reason"] = f"Page type: {page_type}"
            return result

        # Enforce the reputation floor only when the API returned a numeric
        # score (assumes open_page_rank is a number when present).
        if ("open_page_rank" in data
                and isinstance(reputation, (int, float))
                and reputation < self.POLICY_RULES["min_reputation"]):
            result["action"] = "block"
            result["reason"] = f"Reputation: {reputation}"

        return result


# Works identically for all platforms
filter_svc = UniversalAgentFilter(api_key="your_key")

# ChatGPT agent call
chatgpt_result = filter_svc.classify_and_filter(
    "https://example.com/pricing", agent_platform="chatgpt"
)

# Claude Computer Use call
claude_result = filter_svc.classify_and_filter(
    "https://bank.com/login", agent_platform="claude"
)

# Custom agent call
custom_result = filter_svc.classify_and_filter(
    "https://research.edu/papers", agent_platform="custom"
)
```

JavaScript — Multi-Platform Filter Middleware

```javascript
class MultiPlatformFilter {
  constructor(apiKey, policyConfig) {
    this.apiKey = apiKey;
    this.policy = policyConfig;
    // Per-platform request counters
    this.stats = { chatgpt: 0, claude: 0, atlas: 0, custom: 0 };
  }

  async filter(targetURL, platform = "custom") {
    const response = await fetch(
      "https://www.websitecategorizationapi.com" +
        "/api/iab/iab_web_content_filtering.php",
      {
        method: "POST",
        headers: {
          "Content-Type": "application/x-www-form-urlencoded"
        },
        // URLSearchParams handles the form encoding of the target URL
        body: new URLSearchParams({
          query: targetURL,
          api_key: this.apiKey,
          data_type: "url",
          expanded_categories: "1"
        })
      }
    );

    // Fail loudly instead of treating an error response as "allow"
    if (!response.ok) {
      throw new Error(`Categorization API returned HTTP ${response.status}`);
    }

    const data = await response.json();
    const filterCat =
      data.filtering_taxonomy?.[0]?.[0]?.replace("Category name: ", "") ||
      "Unknown";
    const pageType = data.page_type || "unknown";

    this.stats[platform] = (this.stats[platform] || 0) + 1;

    if (this.policy.blockedCategories.includes(filterCat)) {
      return {
        action: "block",
        reason: `Category: ${filterCat}`,
        platform,
        url: targetURL
      };
    }

    if (this.policy.blockedPageTypes.includes(pageType)) {
      return {
        action: "block",
        reason: `Page type: ${pageType}`,
        platform,
        url: targetURL
      };
    }

    return { action: "allow", platform, url: targetURL };
  }
}
```

Cross-Platform Data Flow

URLs from every agent platform classified through a single pipeline

AI Agent Database Pricing

Purpose-built domain databases for AI agent filtering. Includes IAB categories, 20+ page types, reputation scores, and popularity rankings. One-time purchase with perpetual license.

AI Agent Domain Database 10M
$7,999

10 Million Domains with Page-Type Intelligence

One-time purchase: Perpetual license  |  Optional Updates: $1,599/year

  • 10M+ Categorized Domains
  • IAB Taxonomies v2 & v3
  • 20+ Page Type Labels
  • Web Filtering Categories
  • OpenPageRank Scores
  • Global Popularity Rankings

AI Agent Domain Database 20M (Popular)
$14,999

20 Million Domains with Full Intelligence Suite

One-time purchase: Perpetual license  |  Optional Updates: $2,999/year

  • 20M+ Categorized Domains
  • IAB Taxonomies v2 & v3
  • 20+ Page Type Labels
  • Web Filtering Categories
  • OpenPageRank Scores
  • Global & Country Rankings
  • Dedicated Account Manager

AI Agent Domain Database 50M (Maximum Coverage)
$24,999

50 Million Domains with Complete Intelligence Suite

One-time purchase: Perpetual license  |  Optional Updates: $4,999/year

  • 50M+ Categorized Domains
  • IAB Taxonomies v2 & v3
  • 20+ Page Type Labels
  • Web Filtering Categories
  • OpenPageRank Scores
  • Global & Country Rankings
  • Dedicated Account Manager

Also available: Enterprise URL Database up to 102M domains from $2,499.

How Many Domains in Each Category?

Search any IAB or Web Filtering category to see domain coverage across all agent platforms.

Database Analytics

Domain Distribution by Category in Our 102M Enterprise Database

How 102 million domains from our main Enterprise Database are distributed across IAB v3 taxonomy classifications

Top 50 IAB v3 Categories

Spanning Tier 1 through Tier 4 classifications from our 102M Enterprise Database

Charts display domain counts for the top 50 out of 700+ categories in our 102M Enterprise Database. To check the number of domains for the remaining 650+ categories, use the Category Counter tool above.

Agent Ecosystem Map

ChatGPT, Claude, Atlas, and custom agents in a unified governance framework

Deep Dive: Filtering Across Every Major AI Agent Platform

The proliferation of AI agents that can browse the web is accelerating faster than the governance tooling to control them. Over the past few years, OpenAI launched ChatGPT with browsing and Operator. Anthropic shipped Claude Computer Use. Google introduced Project Mariner and Atlas. Microsoft integrated Copilot agents. And the open-source community built thousands of custom agents on LangChain, CrewAI, AutoGen, and similar frameworks. Each of these agents can navigate to any URL on the public internet, and each has a different technical architecture that determines how, and whether, you can filter their web access.

The common denominator across all platforms is the URL. Regardless of how the agent initiates navigation — through a tool call, a browser API, a proxy connection, or direct HTTP — the target URL is always available at some point in the execution path. URL categorization exploits this invariant: classify the URL, apply your policy rules, and make an allow/block decision. The classification data is platform-agnostic; only the hook point for intercepting the URL differs per platform.

OpenAI ChatGPT and Operator: Filtering at the API Boundary

OpenAI's browsing agents operate within OpenAI's infrastructure. When you use ChatGPT with browsing enabled, the model's web requests are processed on OpenAI's servers. For API-based integrations (using the Assistants API or function calling), you have more control. The recommended pattern is to wrap the browsing tool with a categorization check: when the model calls the "browser" tool with a target URL, your middleware intercepts the call, classifies the URL against the 102M database, evaluates the category against your policy, and either forwards the request or returns a "blocked" response. This prevents the model from ever seeing the page content for blocked URLs.
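The wrapping pattern described above can be sketched in Python without tying it to any particular SDK. Everything here is illustrative: `make_filtered_browse_tool`, `fake_classify`, and `fake_browse` are hypothetical stand-ins, and a real integration would plug in the actual tool handler and a call to the categorization service.

```python
from typing import Callable


def make_filtered_browse_tool(
    browse: Callable[[str], str],
    classify: Callable[[str], dict],
    blocked_categories: set,
) -> Callable[[str], str]:
    """Wrap a browse tool so every URL is classified before it is fetched."""

    def filtered_browse(url: str) -> str:
        info = classify(url)
        hits = [c for c in info.get("categories", []) if c in blocked_categories]
        if hits:
            # The model receives this string as the tool result, not page content.
            return f"BLOCKED: {url} is categorized as {hits[0]}; navigation denied by policy."
        return browse(url)

    return filtered_browse


# Stand-ins for illustration only: a fake classifier and a fake fetcher.
def fake_classify(url: str) -> dict:
    return {"categories": ["Gambling"] if "casino" in url else ["Technology"]}


fetched = []


def fake_browse(url: str) -> str:
    fetched.append(url)
    return f"<html>content of {url}</html>"


tool = make_filtered_browse_tool(fake_browse, fake_classify, {"Gambling"})
print(tool("https://casino.example/slots"))
print(tool("https://docs.example/guide"))
```

Because the blocked branch returns before `browse` is called, the underlying fetcher never runs for a denied URL, which is the property the pattern relies on.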

For Operator-style agents, the filtering must happen at a different layer. Where the agent's browser runs in an environment whose network path you control (for example, a container you host), the most effective approach is to configure a web proxy for that environment and load the categorization database into the proxy. All outbound HTTP requests from the agent pass through the proxy, where they are classified and filtered before reaching the target server. This provides transparent filtering without modifying the agent's code or OpenAI's platform. Where the browser runs entirely inside OpenAI's managed infrastructure, you cannot insert your own proxy, so this pattern does not apply.

Anthropic Claude Computer Use: Filtering at the Desktop Level

Claude Computer Use is architecturally distinct from other agent platforms because it controls a full desktop environment rather than a single browser tab. The agent can open any browser, type URLs into the address bar, click links in any application, and navigate the web through any path available to a human user. This makes middleware-level filtering insufficient — the agent might bypass a filtered browser by opening a different one.

The robust solution is network-level filtering. Route all traffic from the agent's VM or container through a transparent proxy that classifies every outbound HTTP/HTTPS request. The proxy loads the 102M domain database into memory and evaluates each domain against the organization's policy before forwarding or blocking the request. This works regardless of which browser or application the agent uses to access the web, because all web traffic must transit the network layer.
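The per-request decision inside such a proxy can be sketched as an in-memory lookup. This is a minimal Python illustration under stated assumptions: `DOMAIN_CATEGORIES` is a tiny stand-in for the loaded database, the parent-domain fallback is one possible design choice rather than documented behavior, and the function names are hypothetical.

```python
from typing import Optional

# Tiny stand-in for the in-memory domain table; a real deployment would
# load the full database here.
DOMAIN_CATEGORIES = {
    "casino.example": "Gambling",
    "example.org": "Education",
}
BLOCKED = {"Gambling", "Malware", "Phishing"}


def lookup_category(hostname: str) -> Optional[str]:
    """Return the category for a hostname, walking up to parent domains."""
    labels = hostname.lower().rstrip(".").split(".")
    # Try "a.b.example.com", then "b.example.com", then "example.com";
    # the loop stops before reaching a bare top-level label.
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in DOMAIN_CATEGORIES:
            return DOMAIN_CATEGORIES[candidate]
    return None


def should_forward(hostname: str, block_unknown: bool = False) -> bool:
    """Decide whether the proxy forwards a request to this hostname."""
    category = lookup_category(hostname)
    if category is None:
        return not block_unknown
    return category not in BLOCKED
```

The `block_unknown` flag makes the treatment of uncategorized hosts an explicit policy choice instead of an accident of the lookup.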

Google Atlas and Mariner: Filtering via Chrome Integration

Google's agent platform leverages Chrome as its browsing engine, which opens up Chrome-native filtering options. A Chrome extension can observe navigation events (using the chrome.webNavigation API), classify the target URL against a local copy of the categorization database, and stop a disallowed navigation before the page loads. Because chrome.webNavigation only reports events and cannot cancel them, the block itself has to be enforced another way, for example with chrome.declarativeNetRequest rules or by redirecting the tab to a block page. This approach is lightweight, runs in-browser, and provides sub-millisecond filtering for database-cached domains. For domains not in the local cache, the extension can call the categorization API as a fallback.
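The local-lookup-then-fallback logic is independent of where it runs. A real extension would be written in JavaScript; the sketch below uses Python to show the control flow only. `CachedClassifier` and `remote_lookup` are hypothetical names, and `remote_lookup` stands in for a call to the categorization API.

```python
from typing import Callable, Dict


class CachedClassifier:
    """Check a local domain table first; fall back to a remote lookup on a miss."""

    def __init__(self, local_db: Dict[str, str], remote_lookup: Callable[[str], str]):
        self.local_db = dict(local_db)
        self.remote_lookup = remote_lookup
        self.remote_calls = 0

    def category_for(self, domain: str) -> str:
        domain = domain.lower()
        if domain in self.local_db:
            return self.local_db[domain]
        # Cache miss: ask the remote service once, then remember the answer.
        self.remote_calls += 1
        category = self.remote_lookup(domain)
        self.local_db[domain] = category
        return category


classifier = CachedClassifier(
    {"news.example": "News"},
    remote_lookup=lambda domain: "Unknown",  # stand-in for the API call
)
print(classifier.category_for("news.example"))
print(classifier.category_for("new-site.example"))
```

Storing the remote answer means a second request for the same uncached domain is served locally.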

The Chrome extension approach has the additional advantage of accessing the full URL path and query parameters, enabling precise page-type matching. While domain-level classification catches the majority of filtering needs, path-level detection is necessary for cases like blocking example.com/admin while allowing example.com/docs. The extension can perform this path-level analysis locally by combining the domain classification from the database with URL path pattern matching.
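Combining a domain category with path patterns might look like the following Python sketch. The pattern list and the function names are assumptions made for illustration; the actual page-type labels and matching rules used by the database are not specified here.

```python
import re
from urllib.parse import urlsplit

# Hypothetical path patterns; each maps a leading path segment to a page type.
PATH_PAGE_TYPES = [
    (re.compile(r"^/(admin|wp-admin)(/|$)"), "admin"),
    (re.compile(r"^/(login|signin)(/|$)"), "login"),
    (re.compile(r"^/(checkout|cart)(/|$)"), "checkout"),
]


def page_type_from_path(url: str) -> str:
    """Infer a page type from the URL path alone."""
    path = urlsplit(url).path.lower() or "/"
    for pattern, page_type in PATH_PAGE_TYPES:
        if pattern.search(path):
            return page_type
    return "unknown"


def decide_url(url, domain_category, blocked_categories, blocked_page_types):
    """Domain-level rule first, then the path-level page-type rule."""
    if domain_category in blocked_categories:
        return "block"
    if page_type_from_path(url) in blocked_page_types:
        return "block"
    return "allow"


print(decide_url("https://example.com/admin", "Technology", {"Gambling"}, {"admin"}))
print(decide_url("https://example.com/docs", "Technology", {"Gambling"}, {"admin"}))
```

Anchoring each pattern on a whole path segment keeps a URL such as /administrator-guide from being mistaken for /admin.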

Custom Agents: LangChain, CrewAI, AutoGen, and DIY Frameworks

Custom agents built on open-source frameworks offer the most flexibility for filtering integration because you control the entire codebase. The recommended pattern varies by framework. In LangChain, create a custom Tool that wraps the WebBrowser tool and performs a categorization check before each navigation. In CrewAI, register a pre-task hook that intercepts the browsing agent's web actions. In AutoGen, add a function-call tool that the agent must invoke before any URL visit. In Playwright or Selenium-based custom agents, override the navigation function (page.goto in Playwright, driver.get in Selenium) to include the categorization check.
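For the Playwright and Selenium case, one way to add the check is a thin proxy object around the page. The sketch below is framework-neutral Python: `GuardedPage`, `NavigationBlocked`, and `FakePage` are hypothetical, and the wrapper assumes only that the wrapped object exposes a `goto(url)` method, as Playwright's page does (for Selenium the same idea would wrap `get`).

```python
class NavigationBlocked(Exception):
    """Raised when policy denies a navigation."""


class GuardedPage:
    """Proxy around any object exposing goto(url); checks policy first."""

    def __init__(self, page, is_allowed):
        self._page = page
        self._is_allowed = is_allowed

    def goto(self, url, **kwargs):
        if not self._is_allowed(url):
            raise NavigationBlocked(url)
        return self._page.goto(url, **kwargs)

    def __getattr__(self, name):
        # Anything other than goto is delegated to the wrapped page.
        return getattr(self._page, name)


class FakePage:
    """Stand-in for a browser page, used only to exercise the wrapper."""

    def __init__(self):
        self.visited = []

    def goto(self, url, **kwargs):
        self.visited.append(url)
        return "ok"

    def title(self):
        return "Fake title"


page = GuardedPage(FakePage(), is_allowed=lambda url: "casino" not in url)
page.goto("https://docs.example/guide")
try:
    page.goto("https://casino.example/slots")
except NavigationBlocked as exc:
    print("blocked:", exc)
```

A wrapper like this covers only explicit navigation calls. Link clicks and redirects inside a page would still need request-level interception, which Playwright exposes through `page.route`.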

The advantage of custom agents is that the filtering can be deeply integrated into the agent's decision-making loop. Instead of just blocking a URL, the filter can return the categorization data to the agent, allowing it to make informed decisions. For example, the agent might receive "this URL is categorized as Financial Services > Banking, page type: login" and adjust its behavior accordingly — choosing to avoid the login page and instead look for the bank's public product information page.
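Returning classification data to the agent can be as simple as serializing it into the tool result. The following Python sketch uses a hypothetical `navigation_observation` helper and an assumed field layout; the guidance text is illustrative wording, not output of the categorization service.

```python
import json


def navigation_observation(url, categories, page_type, action):
    """Serialize the filter verdict plus classification as a tool result."""
    observation = {
        "url": url,
        "action": action,
        "categories": categories,
        "page_type": page_type,
    }
    if action == "block":
        # Illustrative hint so the agent can re-plan instead of retrying.
        observation["guidance"] = (
            "Navigation denied by policy. "
            "Look for a public page on the same topic instead."
        )
    return json.dumps(observation)


print(navigation_observation(
    "https://bank.example/login",
    ["Financial Services > Banking"],
    "login",
    "block",
))
```

Giving the agent the category and page type alongside the verdict lets it choose an alternative, such as a public product page, instead of simply failing the step.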

Unified Policy Management Across All Platforms

The highest-value outcome of a universal categorization layer is unified policy management. Define your filtering rules once — which categories are blocked, which page types are denied, what reputation threshold is required — and apply them consistently across ChatGPT agents, Claude Computer Use sessions, Atlas browsers, and every custom agent in your fleet. When compliance requirements change (a new regulation restricts agent access to healthcare sites), update the policy once and every platform enforces it immediately.

Without a shared categorization layer, policy changes require coordinated updates across multiple systems, each with its own configuration format, deployment process, and validation procedure. A rule change that should take minutes instead takes days, and the window between updating one platform and the others creates a period of inconsistent enforcement that auditors will flag.

Multi-Platform Audit and Compliance Reporting

When all agent platforms share a single categorization service, the audit trail is unified. Every filtering decision — across ChatGPT, Claude, Atlas, and custom agents — flows into a single log with consistent fields: timestamp, agent identity, platform, target URL, classification result, policy rule, and action taken. This unified log enables cross-platform compliance reporting: how many URLs were blocked across all agents this week, which categories were most frequently denied, which agents triggered the most policy violations, and which platforms generated the most web traffic.
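A unified log entry with the fields listed above, plus one example report, could be sketched as follows. The helper names and sample values are hypothetical, and the JSON Lines serialization is an assumed storage choice rather than a format prescribed by the service.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def audit_record(agent_id, platform, url, classification, rule, action):
    """One log entry with the same fields for every platform."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "platform": platform,
        "url": url,
        "classification": classification,
        "policy_rule": rule,
        "action": action,
    }


def blocked_counts(records, key):
    """Count blocked requests grouped by any field, e.g. platform or category."""
    return Counter(r[key] for r in records if r["action"] == "block")


log = [
    audit_record("research-bot-1", "chatgpt", "https://casino.example/",
                 "Gambling", "blocked_categories", "block"),
    audit_record("review-bot-2", "claude", "https://docs.example/api",
                 "Technology", None, "allow"),
    audit_record("ops-bot-3", "custom", "https://bets.example/",
                 "Gambling", "blocked_categories", "block"),
]

# One JSON object per line is easy to ship to most log pipelines.
lines = [json.dumps(record) for record in log]
print(blocked_counts(log, "platform"))
print(blocked_counts(log, "classification"))
```

Because every record has the same shape, the same `blocked_counts` call answers both "which platforms were blocked most" and "which categories were denied most".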

For organizations subject to SOC 2, ISO 27001, or industry-specific regulations (HIPAA, PCI DSS, GDPR), the unified audit trail simplifies the evidence-gathering process dramatically. Instead of collecting logs from five different systems, normalizing the formats, and reconciling the timestamps, the compliance team queries a single data source that covers all agent activity across all platforms.

Unified Audit Trail

All agent platforms feeding into a single governance dashboard

Filter Every Agent Platform With One Database

Deploy unified web filtering across ChatGPT, Claude Computer Use, Atlas, and custom agents. 102 million domains, one integration, every platform covered.
