Network firewalls protect your infrastructure from external threats. An AI agent firewall protects the external web from your agents, and protects your organization from the consequences of uncontrolled agent navigation. Our 102 million domain database provides the category intelligence that powers application-layer inspection of AI agent HTTP traffic, filtering every request by IAB category, page type, and domain reputation before it leaves your network.
Traditional firewalls inspect inbound traffic to protect your network. But AI agents generate outbound traffic — and no standard firewall understands the difference between a product page and a payment portal.
When an AI agent browses the web, it generates outbound HTTP requests from your network to external destinations. Traditional network firewalls and web proxies were designed to control human browsing patterns — a few hundred page views per day per user, with URLs that follow predictable patterns. An AI agent can generate thousands of requests per hour, following link chains across domains that no human would visit, accessing pages in sequences that no human would follow.
A category-aware firewall sits between your AI agents and the internet, intercepting every outbound HTTP request and resolving the destination URL against the 102M domain database. The firewall evaluates the resolved category, page type, and reputation score against your firewall rules and makes a deterministic allow/block decision before the request reaches the public internet. Unlike a network firewall that operates at layer 3-4, this agent firewall operates at layer 7 with full content-category awareness.
The firewall enforces the same security posture on your AI agents that web proxies enforce on your employees — but with agent-specific intelligence. It knows which agent is making the request, what task the agent is performing, and what category of site the agent is trying to reach. This context enables fine-grained rules like "Agent-Finance can access Business and Finance domains but cannot visit any page of type login or checkout."
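A per-agent rule like the Agent-Finance example can be expressed as a small policy table. The sketch below is illustrative only: the `AgentPolicy` class, the `POLICIES` table, and the `decide` function are hypothetical names, and the choice to block unknown agents is an assumption rather than documented product behavior.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentPolicy:
    """Per-agent rule set: allowed categories plus blocked page types."""
    allowed_categories: frozenset
    blocked_page_types: frozenset = field(
        default_factory=lambda: frozenset({"login", "checkout"})
    )


# Hypothetical policy table keyed by agent ID.
POLICIES = {
    "Agent-Finance": AgentPolicy(
        allowed_categories=frozenset({"Business and Finance"}),
    ),
}


def decide(agent_id, category, page_type):
    """Return "allow" or "block" for one request made by one agent."""
    policy = POLICIES.get(agent_id)
    if policy is None:
        return "block"  # assumption: agents without a policy get no access
    if page_type in policy.blocked_page_types:
        return "block"
    if category not in policy.allowed_categories:
        return "block"
    return "allow"
```

With this table, Agent-Finance may read a blog page on a Business and Finance domain, but a checkout page on the same domain is blocked.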
Three operational modes for filtering AI agent web traffic by site category
The firewall operates inline in the agent's HTTP client stack, intercepting requests before they are transmitted. Every URL is resolved against the local category database. Requests to blocked categories are rejected immediately, so the HTTP request never fires. This mode provides the highest security guarantee: no uncategorized traffic reaches the internet.
Deploy the firewall as a transparent HTTP proxy that all agent traffic routes through. The proxy resolves categories, applies rules, and forwards approved requests. This mode works with any agent framework without code changes — configure the proxy address in the agent's environment and all traffic is automatically filtered.
Start with audit-only mode to observe agent browsing patterns without blocking anything. The firewall classifies every URL and logs the category, page type, and what the policy decision would have been — but allows all traffic through. Use this data to tune your firewall rules before switching to enforcement mode.
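The inline mode described above can be sketched as a thin guard around the agent's HTTP call. This is a minimal illustration, not the product's implementation: `InlineGuard`, `BlockedRequest`, and the dictionary-backed `category_db` are hypothetical, and the real transport is injected as `send` so the check always runs before any request is made.

```python
from urllib.parse import urlsplit


class BlockedRequest(Exception):
    """Raised instead of sending a request that the policy rejects."""


class InlineGuard:
    def __init__(self, category_db, blocked_categories, send):
        self.category_db = category_db      # dict: hostname -> category
        self.blocked = set(blocked_categories)
        self.send = send                    # the real HTTP call, injected

    def get(self, url):
        host = (urlsplit(url).hostname or "").lower()
        category = self.category_db.get(host)
        # Uncategorized hosts are rejected too, so no unknown traffic leaves.
        if category is None or category in self.blocked:
            raise BlockedRequest(f"{host}: {category or 'uncategorized'}")
        return self.send(url)
```

Because the rejection is raised before `send` is called, a blocked URL never produces network traffic.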
Production-ready snippets for building a category-aware agent firewall
import http.client
import json
from datetime import datetime, timezone
from urllib.parse import urlencode


class AgentFirewall:
    """Category-aware firewall that inspects and filters
    all outbound HTTP requests from AI agents."""

    BLOCKED_CATEGORIES = [
        "Adult", "Malware", "Phishing", "Illegal Content",
        "Gambling", "Weapons", "Drugs"
    ]
    BLOCKED_PAGE_TYPES = [
        "login", "checkout", "admin", "settings", "signup"
    ]

    def __init__(self, api_key, mode="enforce"):
        self.api_key = api_key
        self.mode = mode  # "audit" logs only; any other value enforces
        self.traffic_log = []

    def inspect_request(self, target_url, agent_id="default"):
        """Inspect an outbound request against category rules."""
        # urlencode escapes "&", "=", and "?" inside the target URL so
        # they cannot corrupt the form body.
        payload = urlencode({
            "query": target_url,
            "api_key": self.api_key,
            "data_type": "url",
            "expanded_categories": "1",
        })
        headers = {
            "Content-Type": "application/x-www-form-urlencoded"
        }
        # Open a fresh connection per lookup; a long-lived connection
        # can be closed by the server between requests.
        conn = http.client.HTTPSConnection(
            "www.websitecategorizationapi.com"
        )
        try:
            conn.request(
                "POST",
                "/api/iab/iab_web_content_filtering.php",
                payload,
                headers
            )
            res = conn.getresponse()
            data = json.loads(res.read().decode("utf-8"))
        finally:
            conn.close()

        # [-1] keeps the whole string if the "Category name: " prefix
        # is missing, instead of raising IndexError.
        categories = [
            c[0].split("Category name: ")[-1]
            for c in data.get("iab_classification", [])
        ]
        page_type = data.get("page_type", "unknown")
        verdict = self._apply_rules(categories, page_type)

        log_entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "agent_id": agent_id,
            "url": target_url,
            "categories": categories,
            "page_type": page_type,
            "verdict": verdict,
            "mode": self.mode
        }
        self.traffic_log.append(log_entry)

        # Audit mode records the would-be verdict but lets traffic pass.
        if self.mode == "audit":
            return "allow", f"Audit: would {verdict}"
        return verdict, f"Firewall: {verdict}"

    def _apply_rules(self, categories, page_type):
        if page_type in self.BLOCKED_PAGE_TYPES:
            return "block"
        for cat in categories:
            for blocked in self.BLOCKED_CATEGORIES:
                if blocked.lower() in cat.lower():
                    return "block"
        return "allow"


# Deploy the firewall
fw = AgentFirewall(api_key="your_api_key", mode="enforce")
verdict, msg = fw.inspect_request(
    "https://example.com/checkout",
    agent_id="research-agent-01"
)
print(f"Firewall verdict: {verdict} - {msg}")
// JavaScript: inspect outbound requests inside a proxy
class FirewallProxy {
  constructor(apiKey, rules = {}) {
    this.apiKey = apiKey;
    this.blockedCategories = rules.blockedCategories || [
      "Adult", "Malware", "Gambling", "Phishing"
    ];
    this.blockedPageTypes = rules.blockedPageTypes || [
      "login", "checkout", "admin"
    ];
    this.inspectionLog = [];
  }

  async inspectOutbound(targetURL, agentContext = {}) {
    const response = await fetch(
      "https://www.websitecategorizationapi.com" +
        "/api/iab/iab_web_content_filtering.php",
      {
        method: "POST",
        headers: {
          "Content-Type": "application/x-www-form-urlencoded"
        },
        body: new URLSearchParams({
          query: targetURL,
          api_key: this.apiKey,
          data_type: "url",
          expanded_categories: "1"
        })
      }
    );
    // Surface HTTP errors instead of parsing an error page as JSON.
    if (!response.ok) {
      throw new Error(`Category lookup failed: HTTP ${response.status}`);
    }
    const data = await response.json();

    const filterCat =
      data.filtering_taxonomy?.[0]?.[0]
        ?.replace("Category name: ", "") || "Unknown";
    const pageType = data.page_type || "unknown";

    const blocked =
      this.blockedCategories.includes(filterCat) ||
      this.blockedPageTypes.includes(pageType);

    const result = {
      url: targetURL,
      category: filterCat,
      pageType,
      verdict: blocked ? "block" : "allow",
      agent: agentContext.agentId || "unknown",
      timestamp: new Date().toISOString()
    };
    this.inspectionLog.push(result);
    return result;
  }
}
Purpose-built domain databases for AI agent filtering. Includes IAB categories, 20+ page types, reputation scores, and popularity rankings. One-time purchase with perpetual license.
10 Million Domains with Page-Type Intelligence
One-time purchase: Perpetual license | Optional Updates: $1,599/year
20 Million Domains with Full Intelligence Suite
One-time purchase: Perpetual license | Optional Updates: $2,999/year
50 Million Domains with Complete Intelligence Suite
One-time purchase: Perpetual license | Optional Updates: $4,999/year
Also available: Enterprise URL Database up to 102M domains from $2,499.
Charts: distribution of the 102 million domains in our main Enterprise Database across IAB v3 taxonomy classifications, spanning Tier 1 through Tier 4. The charts display domain counts for the top 50 of 700+ categories.
The concept of a firewall is well understood in network security: a barrier between a trusted internal network and the untrusted external internet, applying rules to determine which traffic flows in each direction. AI agent firewalls apply the same concept to a new traffic type — autonomous agent HTTP requests. But unlike network firewalls that operate at the packet level, agent firewalls operate at the semantic level: they understand what the destination is, not just where it is.
This semantic awareness is what makes a category-aware firewall fundamentally different from a traditional web proxy or URL filter. A traditional proxy might block a URL because it appears on a known-bad list. An agent firewall blocks a URL because it belongs to a category that is outside the agent's approved scope — even if the specific URL has never been seen before. The firewall's intelligence comes from the 102M domain database, which provides the category context that transforms raw URLs into actionable policy data.
The agent firewall can be deployed in three architectural patterns, each with different tradeoffs. The inline pattern embeds the firewall directly in the agent's HTTP client library, intercepting requests at the code level before they reach the network stack. This provides the tightest security guarantee but requires code changes to each agent. The sidecar pattern deploys the firewall as a separate process alongside each agent instance, intercepting network traffic via iptables rules or local proxy configuration. This works with any agent framework without code changes. The proxy pattern routes all agent traffic through a centralized proxy server that applies firewall rules, providing fleet-wide visibility but introducing a potential single point of failure.
For most enterprise deployments, the sidecar pattern offers the best balance of security and operational simplicity. Each agent instance gets its own firewall process, ensuring that firewall failures are isolated to individual agents rather than affecting the entire fleet. The firewall process loads a local copy of the 102M database and applies rules independently, with no dependency on external services.
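Both the sidecar and the centralized proxy patterns can rely on local proxy configuration, which usually means setting the conventional proxy environment variables for the agent process. The sketch below assumes a hypothetical proxy address of `http://127.0.0.1:8080` and uses Python's standard `urllib.request.getproxies()` to confirm that the setting is picked up. Many HTTP clients honor the same variables, but that should be verified for the client your agent framework uses.

```python
import os
import urllib.request

# Hypothetical address of the filtering proxy or sidecar.
PROXY_URL = "http://127.0.0.1:8080"


def route_through_proxy(proxy_url):
    """Set the conventional proxy variables for this process and its children."""
    for name in ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY"):
        os.environ[name] = proxy_url


route_through_proxy(PROXY_URL)

# urllib reads these variables when it builds its default opener.
proxies = urllib.request.getproxies()
print(proxies.get("https"))
```

An agent that ignores these variables bypasses the proxy entirely, which is one reason the sidecar pattern may pair them with network-level interception.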
Traditional firewalls define rules based on IP addresses, ports, and protocols. These rules are necessary for network security but completely inadequate for agent governance. An IP address tells you which server the agent is connecting to, but it does not tell you whether that server hosts a financial news article (acceptable) or a cryptocurrency trading platform (potentially restricted). Two completely different websites can share the same IP address on a CDN, and a single website can be served from thousands of IP addresses across a global edge network.
Category-based rules solve this problem by operating at the content level rather than the network level. A rule like "block Adult content" does not need to enumerate every IP address that hosts adult content — it simply checks the domain's category in the database. When a new adult content site appears at a new IP address, the firewall blocks it automatically based on its category, not its IP address.
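A category lookup keyed on the hostname, rather than the IP address, might look like the following sketch. The `CATEGORY_DB` dictionary is a hypothetical stand-in for the domain database, and the parent-domain fallback for subdomains is an assumption about how lookups could work, not a description of the product.

```python
from urllib.parse import urlsplit

# Hypothetical in-memory slice of a domain -> category table.
CATEGORY_DB = {
    "example-casino.test": "Gambling",
    "example-news.test": "News and Politics",
}


def lookup_category(url, db=CATEGORY_DB):
    """Resolve a URL to a category by hostname, trying parent domains in turn."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    labels = host.split(".")
    # Try "a.b.example.test", then "b.example.test", then "example.test".
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in db:
            return db[candidate]
    return None  # uncategorized
```

The IP address never enters the decision, so a site that moves to a new address or sits behind a shared CDN resolves to the same category.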
Category alone is not sufficient for a complete firewall. A domain categorized as "Technology & Computing" is generally safe for a tech research agent to visit. But if the specific page is a login page, a checkout page, or an admin panel, the agent should be blocked regardless of the domain's category. This is where page-type awareness adds the second dimension of filtering.
Our database classifies pages into 20+ types: homepage, about, contact, pricing, careers, login, signup, checkout, settings, admin, legal, privacy, terms, blog, documentation, API reference, support, FAQ, forum, and product pages. The firewall evaluates both the category rule and the page-type rule for each request. A request is only allowed if it passes both checks.
Beyond categories and page types, the 102M database includes reputation signals for each domain: OpenPageRank scores and global popularity rankings. These signals provide a third filtering dimension. A domain with a PageRank of 0 and no global ranking is likely a newly registered or rarely visited site, which correlates with higher risk for phishing, malware, or social engineering content. The firewall can incorporate reputation thresholds: for example, allow domains with a PageRank of 3 or higher and a global rank inside the top 10 million, and route lower-reputation domains to a review queue.
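The reputation thresholds mentioned above can be sketched as a single function. The name `reputation_verdict` and its parameters are hypothetical, and representing a domain with no global ranking as `None` is an assumption about the data shape.

```python
def reputation_verdict(page_rank, global_rank,
                       min_page_rank=3, max_global_rank=10_000_000):
    """Return "allow" or "review" from reputation signals alone.

    global_rank is None when the domain has no global ranking.
    A lower rank number means a more popular domain.
    """
    ranked_high_enough = (
        global_rank is not None and global_rank < max_global_rank
    )
    if page_rank >= min_page_rank and ranked_high_enough:
        return "allow"
    return "review"
```

This check would run alongside the category and page-type rules, so a well-ranked domain in a blocked category is still blocked.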
Every firewall decision generates a structured log entry. These logs serve triple duty: operational monitoring (what are my agents doing right now?), security investigation (what did this agent do during the incident window?), and threat intelligence (what categories of sites are agents most frequently blocked from?). Aggregating firewall logs across all agent instances reveals patterns that inform both firewall rule refinement and broader security strategy.
For example, if firewall logs show that a specific agent instance is repeatedly attempting to access gambling sites during a financial research task, that pattern suggests either a prompt injection attack or a configuration error. Without the firewall logs, this anomalous behavior would go undetected until it caused a compliance incident.
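Detecting that kind of repeated attempt is a simple aggregation over the log. The sketch below assumes log entries shaped like those in the Python snippet earlier on this page (`agent_id`, `categories`, `verdict`). The function name and the threshold of three are hypothetical choices.

```python
from collections import Counter


def flag_repeat_blocks(log_entries, threshold=3):
    """Return {(agent_id, category): count} for pairs blocked >= threshold times."""
    counts = Counter()
    for entry in log_entries:
        if entry["verdict"] != "block":
            continue
        for category in entry["categories"]:
            counts[(entry["agent_id"], category)] += 1
    return {pair: n for pair, n in counts.items() if n >= threshold}
```

A result such as `{("finance-agent", "Gambling"): 3}` would be a cue to inspect that agent's prompts and configuration.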
Financial services firms operate under regulations that restrict access to specific types of content and data. An AI agent operating in a financial context must not access gambling sites, adult content, or sites associated with money laundering. Healthcare organizations must prevent agents from accessing or transmitting protected health information to unapproved destinations. Government agencies must block agent access to foreign adversary-controlled domains. The category-aware firewall maps directly to these regulatory requirements because the rules are expressed in the same vocabulary that regulators use.
Start with audit-only mode. Deploy the firewall alongside your existing agent fleet, classify every outbound request, log the categories and what the firewall decision would have been — but allow all traffic through. Run audit mode for two weeks to establish a baseline of agent browsing patterns. Use this data to define your firewall rules: which categories to allow, which to block, and which to route to human review.
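One way to turn the audit baseline into candidate rules is to tally would-be verdicts per category. This is a rough sketch: the `baseline` function is hypothetical and assumes each audit log entry carries a `categories` list and the would-be `verdict`, as in the Python snippet earlier on this page.

```python
from collections import defaultdict


def baseline(audit_log):
    """Summarize an audit log as {category: {"allow": n, "block": n}}."""
    summary = defaultdict(lambda: {"allow": 0, "block": 0})
    for entry in audit_log:
        for category in entry["categories"]:
            summary[category][entry["verdict"]] += 1
    return dict(summary)
```

Categories with many would-be blocks during legitimate tasks are the ones to review before switching to enforcement.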
After defining your rules, switch to enforcement mode on a single agent instance. Monitor the agent's workflow to ensure that legitimate navigation is not disrupted. Gradually roll out enforcement to additional agent instances over the following weeks. The phased deployment approach ensures that your firewall rules match actual agent behavior before you enforce them fleet-wide.
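The phased rollout can be modeled as a set of agent instances that have already been switched to enforcement, with every other agent remaining in audit mode. The names below (`ENFORCED_AGENTS`, `effective_verdict`) are hypothetical and only illustrate the idea.

```python
# Hypothetical rollout state: agent instances already switched to enforcement.
ENFORCED_AGENTS = {"research-agent-01"}


def effective_verdict(agent_id, rule_verdict, enforced_agents=ENFORCED_AGENTS):
    """Apply the rule verdict only for agents already in enforcement."""
    if agent_id in enforced_agents:
        return rule_verdict
    # Audit mode: the would-be verdict is logged elsewhere and traffic passes.
    return "allow"
```

Expanding the rollout then amounts to adding agent IDs to the set once their audit logs show no disrupted workflows.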
Build a category-aware firewall with the 102M domain database. Sub-millisecond filtering, 99.5% internet coverage, full audit logging.