Blocklists tell agents where they cannot go. Allowlists tell agents where they can go — and that distinction is the difference between reactive security and proactive governance. Build a category-based allowlist service powered by our 102 million domain database that defines the exact web perimeter your enterprise browser agents are permitted to operate within.
Blocklists attempt to enumerate every dangerous destination on the internet. By definition, they can only block threats they already know about — leaving your agents exposed to everything else.
A blocklist-only approach assumes that the internet is safe by default and that you only need to enumerate the bad destinations. For human browsing, where users exercise judgment before clicking, this assumption is barely tolerable. For autonomous AI agents that follow link chains without human oversight, it is fundamentally broken. By common estimates, the web hosts on the order of 1 billion websites. A blocklist that covers 10 million known-bad domains still leaves roughly 990 million unlisted destinations where your agent can roam freely.
Instead of trying to enumerate every bad domain, define the categories of domains your agent is allowed to visit. A financial research agent gets access to IAB categories "Business and Finance," "News," and "Technology & Computing" — and nothing else. A product research agent gets "Shopping," "Technology & Computing," and "Business and Finance." Every other category is blocked by default, not because it is explicitly dangerous, but because it is outside the agent's approved scope.
Our 102 million domain database provides the category intelligence that makes this approach practical. Every domain is pre-classified with IAB v3 taxonomy categories, web filtering labels, page types, and reputation scores. Your allowlist service looks up this data locally in under a millisecond and returns a binary decision: the domain's category is in the allowlist, or it is not. No ambiguity, no model inference, no probabilistic guessing.
Three architectural patterns for building an allowlist service that scales with your agent fleet
Define your allowlist as a set of IAB categories, web filtering categories, and page types. Instead of maintaining a list of 50,000 individual domains, you maintain a list of 15-20 category identifiers. The 102M database resolves every domain to its categories, and the allowlist service checks membership in microseconds.
Different agents have different scopes. A financial analyst agent needs access to financial data sites. A marketing agent needs access to advertising and media platforms. Create role-based allowlist profiles that map agent types to approved category sets, ensuring each agent operates within its designated perimeter.
The allowlist service operates on a default-deny model. If a domain's category is not explicitly in the agent's allowlist, the navigation is blocked. This inverts the traditional security model — instead of assuming the internet is safe and blocking known threats, you assume the internet is untrusted and only permit known-good categories.
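The category-based and default-deny patterns can be sketched together in a few lines. This is a minimal illustration under stated assumptions, not the shipped implementation: `DOMAIN_CATEGORIES` is a hypothetical in-memory stand-in for the domain database, and the domains and their category assignments are invented for the example.

```python
# Hypothetical stand-in for the domain database: domain -> set of categories.
# A real deployment would resolve categories from the licensed data instead.
DOMAIN_CATEGORIES = {
    "example-finance.test": {"business and finance"},
    "example-news.test": {"news"},
    "example-games.test": {"video gaming"},
}

# The allowlist is a small set of category identifiers, not a list of domains.
FIN_RESEARCH_ALLOWLIST = frozenset(
    {"business and finance", "news", "technology & computing"}
)


def is_allowed(domain: str, allowlist: frozenset) -> bool:
    """Default-deny: permit only when a resolved category is in the allowlist."""
    categories = DOMAIN_CATEGORIES.get(domain.lower(), set())
    # Unknown domains resolve to an empty set, so they are denied.
    return not allowlist.isdisjoint(categories)


if __name__ == "__main__":
    for d in ("example-finance.test", "example-games.test", "never-seen.test"):
        print(d, "allow" if is_allowed(d, FIN_RESEARCH_ALLOWLIST) else "block")
```

Note that the unknown domain is blocked without appearing on any list: in a default-deny model, absence of an approved category is enough.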
Production-ready snippets to build a category-based allowlist for your agent deployments
import http.client
import json
from urllib.parse import urlencode


class AllowlistService:
    """Enforces category-based allowlisting for enterprise
    browser agents using the 102M domain database."""

    API_HOST = "www.websitecategorizationapi.com"
    API_PATH = "/api/iab/iab_web_content_filtering.php"
    # Page types blocked even when the category is allowlisted
    BLOCKED_PAGE_TYPES = {"login", "checkout", "admin", "settings"}

    def __init__(self, api_key, allowed_categories, allowed_page_types=None):
        self.api_key = api_key
        self.allowed_categories = {c.lower() for c in allowed_categories}
        self.allowed_page_types = {p.lower() for p in (allowed_page_types or [])}

    def resolve_category(self, target_url):
        # urlencode escapes characters such as "&" and "=" inside the target URL
        payload = urlencode({
            "query": target_url,
            "api_key": self.api_key,
            "data_type": "url",
            "expanded_categories": "1",
        })
        headers = {"Content-Type": "application/x-www-form-urlencoded"}
        # Open a fresh connection per lookup; a long-lived connection
        # may have been closed by the server between requests.
        conn = http.client.HTTPSConnection(self.API_HOST, timeout=10)
        try:
            conn.request("POST", self.API_PATH, payload, headers)
            res = conn.getresponse()
            return json.loads(res.read().decode("utf-8"))
        finally:
            conn.close()

    def is_allowed(self, target_url):
        data = self.resolve_category(target_url)
        # Entries are expected to start with "Category name: <name>";
        # malformed entries are skipped instead of raising.
        categories = []
        for c in data.get("iab_classification", []):
            if not c:
                continue
            _, sep, name = str(c[0]).partition("Category name: ")
            if sep:
                categories.append(name.lower())
        page_type = str(data.get("page_type") or "unknown").lower()

        # Default-deny: must match an allowed category.
        # Substring match, so "technology" also matches "technology & computing".
        cat_match = any(
            allowed in cat
            for cat in categories
            for allowed in self.allowed_categories
        )
        if not cat_match:
            return False, f"Category not in allowlist: {categories}"

        # Block restricted page types even if category matches
        if page_type in self.BLOCKED_PAGE_TYPES:
            return False, f"Blocked page type: {page_type}"

        # If an explicit page-type allowlist was supplied, enforce it as well
        if self.allowed_page_types and page_type not in self.allowed_page_types:
            return False, f"Page type not in allowlist: {page_type}"

        return True, "Domain is within allowlisted categories"


if __name__ == "__main__":
    # Financial research agent: narrow scope
    fin_allowlist = AllowlistService(
        api_key="your_api_key",
        allowed_categories=[
            "Business and Finance",
            "News",
            "Technology",
        ],
    )
    ok, reason = fin_allowlist.is_allowed("https://reuters.com")
    print(f"Allowed: {ok} - {reason}")
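// JavaScript gateway: classifies a URL and checks its web filtering category
class AllowlistGateway {
  constructor(apiKey, allowedCategories) {
    this.apiKey = apiKey;
    this.allowedSet = new Set(
      allowedCategories.map(c => c.toLowerCase())
    );
    this.cache = new Map();
  }

  async classifyDomain(targetURL) {
    // The cache is keyed by hostname: the first URL classified for a host
    // determines the cached result for every later URL on that host.
    const domain = new URL(targetURL).hostname;
    if (this.cache.has(domain)) return this.cache.get(domain);

    const response = await fetch(
      "https://www.websitecategorizationapi.com" +
        "/api/iab/iab_web_content_filtering.php",
      {
        method: "POST",
        headers: {
          "Content-Type": "application/x-www-form-urlencoded"
        },
        body: new URLSearchParams({
          query: targetURL,
          api_key: this.apiKey,
          data_type: "url",
          expanded_categories: "1"
        })
      }
    );
    if (!response.ok) {
      throw new Error(`Classification failed: HTTP ${response.status}`);
    }
    const result = await response.json();
    this.cache.set(domain, result);
    return result;
  }

  async checkAllowlist(targetURL) {
    let data;
    try {
      data = await this.classifyDomain(targetURL);
    } catch (err) {
      // Default-deny: a failed or malformed lookup blocks the navigation
      return {
        url: targetURL,
        category: "unknown",
        allowed: false,
        action: "block",
        reason: `Classification error: ${err.message}`
      };
    }

    // Use the first filtering-taxonomy entry; fall back to "unknown"
    const filterCat =
      data.filtering_taxonomy?.[0]?.[0]
        ?.replace("Category name: ", "")
        ?.toLowerCase() || "unknown";
    const isAllowed = this.allowedSet.has(filterCat);

    return {
      url: targetURL,
      category: filterCat,
      allowed: isAllowed,
      action: isAllowed ? "allow" : "block",
      reason: isAllowed
        ? "Category in allowlist"
        : "Category not in allowlist"
    };
  }
}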
Purpose-built domain databases for AI agent filtering. Includes IAB categories, 20+ page types, reputation scores, and popularity rankings. One-time purchase with perpetual license.
10 Million Domains with Page-Type Intelligence
One-time purchase: Perpetual license | Optional Updates: $1,599/year
20 Million Domains with Full Intelligence Suite
One-time purchase: Perpetual license | Optional Updates: $2,999/year
50 Million Domains with Complete Intelligence Suite
One-time purchase: Perpetual license | Optional Updates: $4,999/year
Also available: Enterprise URL Database up to 102M domains from $2,499.
[Interactive tool: Category Counter. Search any IAB or Web Filtering category to see how many domains it contains in our 102M Enterprise Database, the same data powering your allowlist service decisions.]
[Charts: distribution of the 102 million domains in our main Enterprise Database across IAB v3 taxonomy classifications, Tier 1 through Tier 4. The charts show domain counts for the top 50 of 700+ categories; counts for the remaining 650+ categories are available through the Category Counter tool.]
Enterprise security has always operated on one of two philosophical models: default-allow with blocklists, or default-deny with allowlists. For decades, web proxies and firewalls used a default-allow model because the alternative — explicitly approving every website an employee might need — was operationally impractical. Employees browse unpredictably, and no IT team could maintain an allowlist that kept pace with human browsing behavior.
AI agents change this calculus entirely. Unlike human employees, agents operate within defined task scopes. A financial research agent does not need to browse social media. A marketing analytics agent does not need to access healthcare portals. A code review agent does not need to visit e-commerce sites. Because agent scopes are defined in advance, allowlists become operationally practical — you know exactly which categories of sites each agent needs to access, and you can define those categories before the agent launches.
Traditional allowlists operate at the domain level: explicitly enumerate every approved domain. This approach works for small-scale deployments but collapses at enterprise scale. A financial research agent might need to access tens of thousands of financial news sites, data providers, regulatory filings, and company websites. Manually maintaining a domain-level allowlist of that size is a full-time job for multiple analysts.
Category-level allowlisting eliminates this maintenance burden. Instead of listing 50,000 individual financial domains, you add the IAB categories "Business and Finance," "Financial Services," and "News" to the allowlist. The 102M domain database resolves every domain in those categories automatically. When a new financial news site launches, it becomes accessible to your agent as soon as it is categorized in the database, with no manual allowlist update required.
Enterprise deployments typically run multiple agent types, each with a different task scope. A well-designed allowlist service supports role-based profiles that map each agent type to its approved category set. Consider a typical enterprise deployment with four agent roles: financial analyst, marketing researcher, HR recruiter, and IT support. Each role maps to a distinct set of approved categories.
The financial analyst agent gets "Business and Finance," "News," "Legal," and "Government." The marketing researcher gets "Advertising," "Marketing," "Social Networking," "News," and "Technology." The HR recruiter gets "Careers," "Education," "Social Networking," and "Business." The IT support agent gets "Technology & Computing," "Software," "Computers & Electronics," and "Information Security." Each profile is defined once and applied to every agent instance of that role.
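The four profiles above could be expressed as a simple mapping from role to category set. The sketch below is illustrative only: the role keys (`financial_analyst` and so on) and the `role_permits` helper are names chosen for this example, and the category strings are copied from the paragraph above.

```python
# Role -> approved categories, taken from the four profiles described above.
# The role keys are hypothetical identifiers chosen for this sketch.
ROLE_PROFILES = {
    "financial_analyst": {"Business and Finance", "News", "Legal", "Government"},
    "marketing_researcher": {
        "Advertising", "Marketing", "Social Networking", "News", "Technology",
    },
    "hr_recruiter": {"Careers", "Education", "Social Networking", "Business"},
    "it_support": {
        "Technology & Computing", "Software",
        "Computers & Electronics", "Information Security",
    },
}


def _norm(name: str) -> str:
    """Normalize category names so comparisons ignore case and padding."""
    return name.strip().lower()


# Normalize once at load time; every agent instance of a role shares the set.
NORMALIZED_PROFILES = {
    role: frozenset(_norm(c) for c in cats)
    for role, cats in ROLE_PROFILES.items()
}


def role_permits(role: str, resolved_categories) -> bool:
    """True if any resolved category is approved for the role."""
    allowed = NORMALIZED_PROFILES.get(role)
    if allowed is None:
        return False  # unknown role: default-deny
    return any(_norm(c) in allowed for c in resolved_categories)


if __name__ == "__main__":
    print(role_permits("financial_analyst", ["News"]))
    print(role_permits("it_support", ["Social Networking"]))
```

Keeping the profiles as data rather than code means a policy change is a one-line edit to the mapping, applied to every agent of that role.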
Not every allowlist decision is binary. Some domains fall into categories that are partially within scope — for example, a general news site that occasionally publishes financial content. Rather than blocking these domains outright, the allowlist service can route them to a review queue where a human analyst or a secondary validation layer evaluates whether the specific page (not just the domain) is within scope.
Page-type intelligence from the 102M database enables this nuanced handling. A domain might be categorized as "News" (allowed) but the specific page the agent wants to visit is a "login" page type (blocked regardless of category). The allowlist service checks both the category allowlist and the page-type blocklist, ensuring that even within approved categories, sensitive page types are protected.
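One way to combine the review queue with the page-type check is a three-way decision. The sketch below assumes each profile carries a second, hypothetical set of "review" categories for partially in-scope content; the blocked page types are the ones used in the Python snippet earlier on this page, and the `decide` function and `Decision` enum are names invented for the example.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REVIEW = "review"   # route to a human analyst or secondary validator
    BLOCK = "block"


# Sensitive page types, blocked regardless of category.
BLOCKED_PAGE_TYPES = frozenset({"login", "checkout", "admin", "settings"})


def decide(categories, page_type, allowed, review) -> Decision:
    """Page-type blocklist first, then category allowlist, then review set."""
    cats = {c.lower() for c in categories}
    if page_type.lower() in BLOCKED_PAGE_TYPES:
        return Decision.BLOCK
    if cats & allowed:
        return Decision.ALLOW
    if cats & review:
        return Decision.REVIEW
    return Decision.BLOCK  # default-deny for everything else


if __name__ == "__main__":
    allowed = frozenset({"business and finance"})
    review = frozenset({"news"})  # hypothetical: news is only partially in scope
    print(decide(["News"], "article", allowed, review))
    print(decide(["Business and Finance"], "login", allowed, review))
```

Checking the page type first ensures a sensitive page is blocked even when its domain sits in a fully approved category.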
Allowlists can drift over time as business requirements change, new agent roles are added, and organizational policies evolve. A well-designed allowlist service includes continuous validation: periodically auditing which categories each agent type actually accesses versus which categories are in its allowlist. This analysis identifies over-permissioned profiles (agents with access to categories they never use) and under-permissioned profiles (agents that frequently hit blocked categories because their allowlist is too narrow).
The audit data from the allowlist service feeds directly into this validation process. Every allow and block decision is logged with the timestamp, agent instance, target domain, resolved category, and decision outcome. Aggregating these logs by agent role and category reveals the actual usage patterns that should inform allowlist refinement.
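The aggregation described here can be sketched as follows. The record fields mirror the ones listed above, with a `role` field added so logs can be grouped by agent role; the `AuditRecord` and `audit_profile` names and the `block_threshold` parameter are assumptions made for this illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    timestamp: datetime
    agent_id: str
    role: str        # added for grouping; not one of the logged fields above
    domain: str
    category: str
    allowed: bool


def audit_profile(role, allowlist, records, block_threshold=3):
    """Return (unused allowlisted categories, frequently blocked categories)."""
    used = set()
    blocked_counts = defaultdict(int)
    for r in records:
        if r.role != role:
            continue
        if r.allowed:
            used.add(r.category)
        else:
            blocked_counts[r.category] += 1
    # Over-permissioned: approved categories the role never actually visited.
    unused = set(allowlist) - used
    # Under-permissioned: categories blocked at least `block_threshold` times.
    frequent = {c: n for c, n in blocked_counts.items() if n >= block_threshold}
    return unused, frequent


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    logs = [AuditRecord(now, "a1", "fin", "x.test", "news", True)]
    logs += [AuditRecord(now, "a1", "fin", "y.test", "shopping", False)] * 3
    print(audit_profile("fin", {"news", "legal"}, logs))
```

A category that shows up as frequently blocked is a prompt for review, not an automatic addition: the blocked traffic may be exactly what the allowlist is meant to stop.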
Financial services regulators, healthcare compliance frameworks, and government security standards increasingly require organizations to demonstrate control over AI agent web access. An allowlist-based governance model provides the documentation these regulators need: a defined set of approved categories, a deterministic decision mechanism, and a complete audit trail of every navigation event and its disposition.
Compare this to a blocklist-based model, where the compliance evidence is a list of known-bad domains and a hope that the agent did not visit something worse. When a regulator asks "how do you ensure your AI agents only access appropriate websites?", an allowlist answer is definitive: "Our agents can only access domains in these specific IAB categories, and here is the audit log proving it." A blocklist answer is defensive: "We tried to block the bad stuff, and we think we got most of it."
The 102M database loads into a Redis instance on a machine with 32GB of RAM. Each domain-to-category lookup completes in under 1 millisecond. The allowlist check, which compares the returned category against the approved set, adds negligible overhead. End-to-end, from the agent's navigation intent to the allow/block decision, the total latency is under 2 milliseconds for domains in the local database.
For domains not in the local database, the real-time API provides on-demand classification with an average response time of 200 milliseconds. This fallback is invoked for less than 0.5% of agent navigation requests, keeping the overall performance impact minimal even for agents that encounter unusual domains.
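A local-first lookup with an API fallback might look like the sketch below. Plain dictionaries stand in for the local database and a cache of earlier API answers, and `classify_remote` is a hypothetical callable wrapping the real-time API; none of these names come from the product itself. The second function estimates average latency from the figures quoted above, treated as upper bounds.

```python
from typing import Callable, Dict, Optional, Tuple


def resolve(
    domain: str,
    local_db: Dict[str, str],
    api_cache: Dict[str, str],
    classify_remote: Callable[[str], Optional[str]],
) -> Tuple[Optional[str], str]:
    """Return (category, source); source is 'local', 'cache', or 'api'."""
    key = domain.lower()
    if key in local_db:
        return local_db[key], "local"
    if key in api_cache:
        return api_cache[key], "cache"
    category = classify_remote(key)  # slow path: network round trip
    if category is not None:
        api_cache[key] = category    # avoid paying the API latency twice
    return category, "api"


def expected_latency_ms(local_ms: float, api_ms: float, fallback_rate: float) -> float:
    """Weighted average of the fast local path and the slow API path."""
    return (1.0 - fallback_rate) * local_ms + fallback_rate * api_ms


if __name__ == "__main__":
    # 2 ms local path, 200 ms API path, 0.5% of requests falling back
    print(round(expected_latency_ms(2.0, 200.0, 0.005), 2))
```

With those inputs the weighted average comes to about 3 ms per navigation request, so the fallback adds roughly 1 ms on average even though each individual API call is two orders of magnitude slower than a local lookup.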
Some organizations consider building their own domain categorization engine to power their allowlist service. This approach requires training data (millions of labeled domains), ML model infrastructure, continuous re-training pipelines, and a team to maintain accuracy over time. The total cost of ownership typically exceeds $500,000 per year for a production-grade system — and the resulting database covers a fraction of the 102M domains in our pre-built offering.
Buying the database eliminates this entire build-and-maintain cycle. You receive 102 million pre-classified domains with IAB categories, page types, reputation scores, and popularity rankings — ready to deploy as your allowlist data source within hours, not months. The one-time purchase model means no ongoing subscription fees for the base data, and optional annual updates keep the data current.
Deploy category-based allowlisting with the 102M domain database. One-time purchase, perpetual license, default-deny security for every agent in your fleet.