Not every AI agent should have the same web access. A market research agent needs news and competitor sites. A customer support agent needs knowledge bases and documentation. A code assistant needs technical references. RBAC for agents maps each role to a specific set of allowed IAB categories, page types, and domain reputations — enforced by a 102 million domain database that resolves every URL before the agent navigates.
Most agent deployments give every agent the same unrestricted web access. This is the equivalent of giving every employee in your organization the same network permissions — a practice that enterprise security abandoned decades ago.
When every agent in your orchestration framework has identical web access permissions, your risk surface is defined by the most dangerous action any single agent could take. A data analysis agent that only needs to visit government statistics sites operates under the same permissions as a research agent that needs to browse competitor websites. If either agent can visit financial portals, adult content, or malware-hosting domains, then every agent in your system carries that risk — regardless of whether its task requires such access.
Role-Based Access Control for AI agents maps each agent role to a defined set of IAB categories, web filtering categories, and page types that the agent is permitted to access. A "Market Research Agent" role might be allowed to visit News, Business and Finance (excluding Financial Services), Technology, and Shopping categories. A "Customer Support Agent" role might be limited to Documentation, FAQ, Support, and Knowledge Base page types. A "Code Assistant Agent" might only access Technology, Computing, and Developer Documentation categories.
The enforcement mechanism is a 102 million domain database that resolves every URL to its IAB category, page type, and reputation score before the agent navigates. The agent harness evaluates the classification result against the agent's assigned role permissions and produces a deterministic allow/block/escalate decision. There is no model inference in the decision path — the policy evaluation is a simple set membership check that executes in microseconds.
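A minimal sketch of that decision step, assuming the URL has already been resolved to a set of category names. The names `RolePolicy`, `decide`, and the `review` field are illustrative assumptions, not part of any product API; the point is only that the verdict reduces to a few set intersections.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class RolePolicy:
    allowed: FrozenSet[str]                 # categories the role may visit
    blocked: FrozenSet[str]                 # categories that are always denied
    review: FrozenSet[str] = frozenset()    # categories sent for human review


def decide(policy: RolePolicy, categories: set) -> str:
    """Return 'block', 'escalate', or 'allow' using only set operations."""
    if categories & policy.blocked:
        return "block"
    if categories & policy.review:
        return "escalate"
    if categories & policy.allowed:
        return "allow"
    return "block"  # nothing matched: deny by default


# Hypothetical policy for a research role.
research = RolePolicy(
    allowed=frozenset({"News", "Technology"}),
    blocked=frozenset({"Adult", "Gambling"}),
    review=frozenset({"Financial Services"}),
)

print(decide(research, {"News"}))
print(decide(research, {"News", "Gambling"}))
print(decide(research, {"Financial Services"}))
```

Blocks are checked before allows here, so a page that carries both an allowed and a blocked category is denied. That ordering is a design choice of this sketch.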
Three components that transform flat agent permissions into granular, role-based web access control
Create named roles that reflect the functional purpose of each agent in your system. A "Research Agent" role, a "Support Agent" role, a "Code Agent" role, a "Content Agent" role. Each role carries metadata about its intended task scope, the departments that operate it, and the sensitivity level of the data it handles. Roles are the organizational primitive — every agent instance is assigned exactly one role at deployment time.
For each role, define which IAB categories, web filtering categories, and page types are allowed, blocked, or flagged for review. Use the 700+ IAB categories at any tier level, from broad Tier 1 blocks (block all "Adult") to surgical Tier 4 allows (allow "Technology & Computing > Computing > Cloud Computing > Infrastructure as a Service"). Layer page-type rules on top: even within allowed categories, block login, checkout, and admin page types for all non-privileged roles.
When an agent attempts to navigate to a URL, the harness extracts the domain, queries the 102M database for its classification, and evaluates the result against the agent's role permissions. The entire lookup-classify-evaluate pipeline executes in under 1ms for local database deployments. The decision is logged with the agent ID, role, target URL, classification result, and policy verdict — creating a complete audit trail for every navigation event across every agent in your system.
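The lookup step described above can be sketched as follows, assuming a local table keyed by registered domain. `LOCAL_DB`, its field names, and the sample entries are hypothetical stand-ins for the real database; the fallback from subdomain to parent domain is also an assumption about how a lookup might behave.

```python
from urllib.parse import urlsplit

# Hypothetical in-memory slice of a local classification table.
LOCAL_DB = {
    "python.org": {"category": "Technology", "reputation": 1},
    "example-casino.test": {"category": "Gambling", "reputation": 7},
}


def extract_domain(url: str) -> str:
    """Return the lowercase hostname of a URL, or '' if it has none."""
    host = urlsplit(url).hostname or ""
    return host.rstrip(".")


def lookup(url: str):
    """Try the full hostname first, then strip leading labels one at a time."""
    labels = extract_domain(url).split(".")
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in LOCAL_DB:
            return candidate, LOCAL_DB[candidate]
    return None, None  # unclassified: the caller should treat this as a deny


domain, record = lookup("https://docs.python.org/3/")
print(domain, record)
```

An unclassified domain returns `(None, None)` so that the policy layer can fail closed instead of guessing.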
Production-ready snippets to enforce role-based web access control in your agent framework
# Python: role definitions and a filter that calls the classification API
import http.client
import json
from dataclasses import dataclass, field
from urllib.parse import urlencode


@dataclass
class AgentRole:
    """Defines web access permissions for an agent role."""
    name: str
    allowed_categories: list = field(default_factory=list)
    blocked_categories: list = field(default_factory=list)
    allowed_page_types: list = field(default_factory=list)
    blocked_page_types: list = field(default_factory=list)
    max_reputation_risk: int = 5  # 0-10 scale; not enforced by evaluate() below


# Define roles with category permissions
ROLES = {
    "research_agent": AgentRole(
        name="Research Agent",
        allowed_categories=[
            "News", "Technology", "Business and Finance",
            "Science", "Education"
        ],
        blocked_categories=[
            "Adult", "Illegal Content", "Gambling",
            "Financial Services", "Banking"
        ],
        blocked_page_types=[
            "login", "checkout", "admin", "settings"
        ]
    ),
    "support_agent": AgentRole(
        name="Support Agent",
        allowed_categories=[
            "Technology", "Computers & Technology"
        ],
        allowed_page_types=[
            "documentation", "faq", "support",
            "knowledge_base", "blog", "homepage"
        ],
        blocked_page_types=[
            "login", "checkout", "admin", "settings",
            "payment", "dashboard"
        ]
    ),
    "code_agent": AgentRole(
        name="Code Assistant",
        allowed_categories=[
            "Technology", "Computers & Technology",
            "Science", "Education"
        ],
        blocked_categories=[
            "Adult", "Gambling", "Shopping",
            "Financial Services"
        ],
        blocked_page_types=[
            "login", "checkout", "admin", "payment"
        ]
    )
}


class AgentRBACFilter:
    def __init__(self, api_key):
        self.api_key = api_key
        self.conn = http.client.HTTPSConnection(
            "www.websitecategorizationapi.com"
        )

    def classify(self, url):
        # urlencode escapes '&', '=', and '#' in the target URL so they
        # cannot corrupt the form-encoded request body
        payload = urlencode({
            "query": url,
            "api_key": self.api_key,
            "data_type": "url",
            "expanded_categories": 1,
        })
        headers = {
            "Content-Type": "application/x-www-form-urlencoded"
        }
        self.conn.request(
            "POST",
            "/api/iab/iab_web_content_filtering.php",
            payload, headers
        )
        return json.loads(
            self.conn.getresponse().read().decode("utf-8")
        )

    def evaluate(self, url, role_id):
        role = ROLES.get(role_id)
        if not role:
            return {"allowed": False, "reason": "Unknown role"}
        data = self.classify(url)
        page_type = data.get("page_type", "unknown")
        # Strip the "Category name: " prefix; [-1] keeps the whole string
        # if the prefix is absent instead of raising IndexError
        categories = [
            c[0].split("Category name: ")[-1]
            for c in data.get("iab_classification", [])
        ]
        # Check blocked page types
        if page_type in role.blocked_page_types:
            return {
                "allowed": False,
                "reason": f"Page type '{page_type}' blocked "
                          f"for role '{role.name}'"
            }
        # Check allowed page types (an empty list means no allow list)
        if role.allowed_page_types and page_type not in role.allowed_page_types:
            return {
                "allowed": False,
                "reason": f"Page type '{page_type}' not on the allow list "
                          f"for role '{role.name}'"
            }
        # Check blocked categories
        for cat in categories:
            for blocked in role.blocked_categories:
                if blocked.lower() in cat.lower():
                    return {
                        "allowed": False,
                        "reason": f"Category '{cat}' blocked "
                                  f"for role '{role.name}'"
                    }
        # Check allowed categories: at least one category must match
        if role.allowed_categories and not any(
            allowed.lower() in cat.lower()
            for cat in categories
            for allowed in role.allowed_categories
        ):
            return {
                "allowed": False,
                "reason": f"No matching category for role '{role.name}'"
            }
        return {"allowed": True, "reason": "Role permits access"}


# Usage
rbac = AgentRBACFilter(api_key="your_api_key")
result = rbac.evaluate(
    "https://docs.python.org/3/", "code_agent"
)
print(result)  # e.g. {"allowed": True, ...} if the API returns a Technology category
// JavaScript: the same role check using fetch (Node 18+ or a browser)
const AGENT_ROLES = {
  research: {
    name: "Research Agent",
    allowedCategories: [
      "News", "Technology", "Science", "Education"
    ],
    blockedPageTypes: [
      "login", "checkout", "admin", "settings"
    ]
  },
  support: {
    name: "Support Agent",
    allowedCategories: ["Technology"],
    allowedPageTypes: [
      "documentation", "faq", "support", "blog"
    ],
    blockedPageTypes: [
      "login", "checkout", "admin", "payment"
    ]
  }
};

async function rbacCheck(url, roleId, apiKey) {
  const role = AGENT_ROLES[roleId];
  if (!role) return { allowed: false, reason: "Unknown role" };

  const res = await fetch(
    "https://www.websitecategorizationapi.com" +
      "/api/iab/iab_web_content_filtering.php",
    {
      method: "POST",
      headers: {
        "Content-Type": "application/x-www-form-urlencoded"
      },
      body: new URLSearchParams({
        query: url,
        api_key: apiKey,
        data_type: "url",
        expanded_categories: "1"
      })
    }
  );
  // Fail closed if the classification request itself fails
  if (!res.ok) {
    return {
      allowed: false,
      reason: `Classification failed (HTTP ${res.status})`
    };
  }
  const data = await res.json();
  const pageType = data.page_type || "unknown";

  if (role.blockedPageTypes?.includes(pageType)) {
    return {
      allowed: false,
      reason: `Page type '${pageType}' not permitted ` +
        `for ${role.name}`
    };
  }
  // If the role has a page-type allow list, the page type must be on it
  if (role.allowedPageTypes?.length &&
      !role.allowedPageTypes.includes(pageType)) {
    return {
      allowed: false,
      reason: `Page type '${pageType}' not on the allow list ` +
        `for ${role.name}`
    };
  }

  // Drop malformed entries so toLowerCase() is never called on undefined
  const categories = (data.iab_classification || [])
    .map(c => c[0]?.replace("Category name: ", ""))
    .filter(Boolean);
  const hasAllowed = categories.some(cat =>
    role.allowedCategories.some(a =>
      cat.toLowerCase().includes(a.toLowerCase())
    )
  );
  if (!hasAllowed && role.allowedCategories.length > 0) {
    return {
      allowed: false,
      reason: `No matching category for ${role.name}`
    };
  }
  return { allowed: true, reason: "Role permits access" };
}
Purpose-built domain databases for AI agent filtering. Includes IAB categories, 20+ page types, reputation scores, and popularity rankings. One-time purchase with perpetual license.
10 Million Domains with Page-Type Intelligence
One-time purchase: Perpetual license | Optional Updates: $1,599/year
20 Million Domains with Full Intelligence Suite
One-time purchase: Perpetual license | Optional Updates: $2,999/year
50 Million Domains with Complete Intelligence Suite
One-time purchase: Perpetual license | Optional Updates: $4,999/year
Also available: Enterprise URL Database up to 102M domains from $2,499.
[Charts: distribution of the 102 million domains in the Enterprise Database across IAB v3 taxonomy classifications, Tier 1 through Tier 4. Domain counts are shown for the top 50 of 700+ categories.]
Role-Based Access Control is one of the most widely deployed authorization models in enterprise IT. Corporate networks, cloud platforms, and SaaS applications routinely use RBAC to ensure that users can only access resources appropriate to their job function. The principle is simple: instead of assigning permissions to individual users, you assign permissions to roles, and then assign users to roles. A "Finance Analyst" role can access the accounting system. A "Software Engineer" role can access the code repository. An "HR Manager" role can access the personnel database. No individual user accumulates permissions beyond what their role requires.
Applying this same model to AI agents is a natural extension, but one that few organizations have implemented. Today, most agent deployments operate with a flat permission model: every agent can access every website. This is the equivalent of giving every employee in your company domain admin credentials, a practice that would fail a security audit. The gap exists because the tooling to implement agent RBAC was missing until pre-classified domain databases made it possible to map web categories to role permissions programmatically.
The first step in implementing agent RBAC is defining roles that reflect the actual business functions your agents perform. Avoid the temptation to create overly broad roles like "General Purpose Agent" — these defeat the purpose of RBAC by concentrating too many permissions in a single role. Instead, create narrowly scoped roles that align with specific agent tasks.
A well-designed role system might include: "Market Research Agent" (allowed: News, Business, Technology, Shopping; blocked: Adult, Gambling, Financial Services), "Customer Support Agent" (allowed: Documentation, FAQ, and Support page types, and only within the Technology category), "Legal Research Agent" (allowed: Legal, Government, News; blocked: everything else), "Content Creation Agent" (allowed: News, Arts, Entertainment, Education; blocked: Adult, Gambling, Malware), and "Security Monitoring Agent" (allowed: all categories; blocked: none, but every visit is logged and audited). Each role carries a specific set of IAB category allows and blocks, page-type restrictions, and reputation thresholds.
Mature RBAC implementations support role hierarchies where child roles inherit permissions from parent roles. For agent web access, this means you can define a "Base Agent" role with universal blocks (Adult, Malware, Illegal Content, Phishing) and then create specialized roles that inherit these blocks while adding their own category-specific allows. The "Research Agent" role inherits the Base Agent blocks and adds News, Technology, and Business to its allow list. The "Support Agent" inherits the same blocks and adds Documentation and FAQ to its allow list.
This hierarchical approach simplifies policy management as your agent fleet grows. When a new threat category emerges — say, a new class of deepfake generation sites — you add it to the Base Agent block list once, and every child role automatically inherits the block. Without hierarchy, you would need to update the block list for every role individually, creating a maintenance burden that scales linearly with the number of roles.
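The inheritance described above can be sketched as a walk up a parent chain. The `Role` class, the `effective_blocks` helper, and the "Deepfake Generation" category name are hypothetical; this is one simple way to model a hierarchy, not a prescribed implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Role:
    name: str
    parent: Optional[str] = None
    allows: Set[str] = field(default_factory=set)
    blocks: Set[str] = field(default_factory=set)


ROLES: Dict[str, Role] = {
    "base": Role("base", blocks={"Adult", "Malware", "Illegal Content", "Phishing"}),
    "research": Role("research", parent="base", allows={"News", "Technology", "Business"}),
    "support": Role("support", parent="base", allows={"Documentation", "FAQ"}),
}


def effective_blocks(role_id: str) -> Set[str]:
    """Union of the role's own blocks and the blocks of every ancestor."""
    blocks: Set[str] = set()
    seen: Set[str] = set()  # guards against an accidental cycle in parents
    current: Optional[str] = role_id
    while current is not None and current not in seen:
        seen.add(current)
        role = ROLES[current]
        blocks |= role.blocks
        current = role.parent
    return blocks


# A new threat category is added once, at the base role.
ROLES["base"].blocks.add("Deepfake Generation")
print(sorted(effective_blocks("research")))
print(sorted(effective_blocks("support")))
```

Because child roles compute their blocks at evaluation time instead of copying them, the single update to the base role reaches both children.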
The IAB Content Taxonomy v3 provides four tiers of category granularity, and the choice of tier level significantly affects the precision of your RBAC policies. Tier 1 categories like "Technology & Computing" are broad: blocking this category would prevent a code assistant agent from visiting any technology website, which is almost certainly too restrictive. Tier 4 categories like "Technology & Computing > Computing > Cloud Computing > Infrastructure as a Service" are extremely specific: allowing only this category gives the agent access to AWS, Azure, and GCP documentation while keeping consumer electronics sites off its allow list.
The recommended approach is to use mixed-tier permissions within each role. Apply Tier 1 blocks for categories that are universally dangerous (Adult, Illegal Content). Apply Tier 2 allows for the primary domain of the agent's work (e.g., "Technology & Computing > Computing" for a code agent). Apply Tier 3 or Tier 4 blocks for edge cases within otherwise allowed categories (e.g., block "Technology & Computing > Consumer Electronics > Smartphones" for an infrastructure-focused agent). This multi-tier approach gives you the precision of fine-grained permissions with the safety of broad categorical blocks.
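One way to evaluate mixed-tier rules is to let the deepest matching rule win. The sketch below assumes category paths are strings joined with " > " and that rules are stored in a plain dictionary; `RULES` and `verdict_for` are illustrative names.

```python
SEP = " > "

# Hypothetical rules for an infrastructure-focused code agent.
RULES = {
    "Adult": "block",
    "Illegal Content": "block",
    "Technology & Computing > Computing": "allow",
    "Technology & Computing > Consumer Electronics > Smartphones": "block",
}


def verdict_for(category_path: str, rules: dict, default: str = "block") -> str:
    """Walk from the full path up to Tier 1; the deepest matching rule wins."""
    parts = category_path.split(SEP)
    for depth in range(len(parts), 0, -1):
        prefix = SEP.join(parts[:depth])
        if prefix in rules:
            return rules[prefix]
    return default  # no rule at any tier: deny


iaas = "Technology & Computing > Computing > Cloud Computing > Infrastructure as a Service"
print(verdict_for(iaas, RULES))
print(verdict_for("Technology & Computing > Consumer Electronics > Smartphones", RULES))
```

With deepest-match semantics, a deeper allow could in principle override a Tier 1 block. If universal blocks must never be overridden, check them in a separate pass before this lookup.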
Page-type permissions operate orthogonally to category permissions. A "login" page type is dangerous regardless of whether it appears on a Technology site, a News site, or a Shopping site. Similarly, an "admin" page type should be blocked for all agent roles except a specially designated "IT Administration Agent" role (if such a role is even appropriate). This makes page-type permissions a cross-cutting concern that applies across all categories.
In the RBAC model, define a set of universally blocked page types (login, checkout, admin, settings, payment, dashboard) that apply to every role. Then allow specific roles to override individual page types when their function requires it. For example, a "QA Testing Agent" might need access to login pages to verify authentication flows — in this case, the QA role explicitly overrides the universal login block. The override is logged and audited, maintaining visibility into the exception.
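A minimal sketch of universal page-type blocks with logged per-role overrides. The role identifier `qa_testing_agent`, the logger name, and the `PAGE_TYPE_OVERRIDES` table are assumptions made for illustration.

```python
import logging

logger = logging.getLogger("agent_rbac")

UNIVERSAL_BLOCKED_PAGE_TYPES = frozenset(
    {"login", "checkout", "admin", "settings", "payment", "dashboard"}
)

# Hypothetical per-role exceptions to the universal list.
PAGE_TYPE_OVERRIDES = {
    "qa_testing_agent": frozenset({"login"}),
}


def page_type_allowed(role_id: str, page_type: str) -> bool:
    """Apply the universal block list, honoring explicit role overrides."""
    if page_type not in UNIVERSAL_BLOCKED_PAGE_TYPES:
        return True
    if page_type in PAGE_TYPE_OVERRIDES.get(role_id, frozenset()):
        # Every use of an exception is recorded so it stays visible.
        logger.warning("override used: role=%s page_type=%s", role_id, page_type)
        return True
    return False


print(page_type_allowed("qa_testing_agent", "login"))
print(page_type_allowed("research_agent", "login"))
```

The override grants only the listed page type: the QA role in this sketch can reach login pages but is still blocked from checkout and admin pages.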
The principle of least privilege states that any entity should have only the minimum permissions necessary to perform its function. For AI agents, this means starting with zero web access permissions and adding only the specific categories, page types, and domain reputation ranges that the agent's task requires. This is the opposite of the common approach, which starts with full access and adds blocks — an approach that inevitably leaves gaps.
To implement least privilege, begin by documenting the specific web resources each agent type needs. A market research agent needs to visit competitor websites (Shopping, Business), read industry news (News), and check social media sentiment (Social Media). It does not need to visit banking sites, HR platforms, gaming sites, or adult content. Map these requirements to IAB categories, and you have a minimal permission set. Any domain that falls outside these categories is automatically denied — no explicit block list needed.
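The mapping from documented needs to a minimal permission set can be sketched like this. The task descriptions and category names follow the market research example above; requiring every category on a page to be on the allow list is a stricter reading chosen for this sketch, and `is_permitted` is a hypothetical helper.

```python
# Documented needs for a market research agent, mapped to categories.
REQUIREMENTS = {
    "visit competitor websites": {"Shopping", "Business"},
    "read industry news": {"News"},
    "check social media sentiment": {"Social Media"},
}

# The allow list is the union of what the tasks need, and nothing else.
ALLOWED = frozenset().union(*REQUIREMENTS.values())


def is_permitted(categories: set) -> bool:
    """Default deny: every category on the page must be on the allow list."""
    return bool(categories) and categories <= ALLOWED


print(sorted(ALLOWED))
print(is_permitted({"News"}))
print(is_permitted({"Banking"}))
```

No block list appears anywhere in the sketch. Banking, gaming, and adult content are denied simply because no documented task asked for them, and an unclassified page (an empty category set) is denied as well.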
Every RBAC decision must be logged for compliance and forensic purposes. The audit log should capture: the agent instance ID, the assigned role, the target URL, the database classification result (IAB category, page type, reputation score), the RBAC policy verdict (allow, block, or escalate), the policy rule that triggered the verdict, and the timestamp. This log serves two purposes: real-time monitoring for anomalous agent behavior, and historical audit trail for compliance reporting.
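The fields listed above map naturally onto one structured record per navigation event. The sketch below assumes JSON lines as the log format and a classification dictionary with `iab_category`, `page_type`, and `reputation` keys; those key names and the `AuditRecord` class are illustrative.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class AuditRecord:
    agent_id: str
    role: str
    url: str
    iab_category: str
    page_type: str
    reputation: int
    verdict: str    # "allow", "block", or "escalate"
    rule: str       # the policy rule that produced the verdict
    timestamp: str  # ISO 8601, UTC


def make_record(agent_id, role, url, classification, verdict, rule, now=None):
    """Build one audit record; `now` can be injected for testing."""
    now = now or datetime.now(timezone.utc)
    return AuditRecord(
        agent_id=agent_id, role=role, url=url,
        iab_category=classification["iab_category"],
        page_type=classification["page_type"],
        reputation=classification["reputation"],
        verdict=verdict, rule=rule,
        timestamp=now.isoformat(),
    )


def to_json_line(record: AuditRecord) -> str:
    """Serialize with sorted keys so log lines are stable and diffable."""
    return json.dumps(asdict(record), sort_keys=True)


record = make_record(
    "agent-7", "research_agent", "https://example.com/login",
    {"iab_category": "Technology", "page_type": "login", "reputation": 2},
    verdict="block", rule="blocked_page_type:login",
)
print(to_json_line(record))
```

Recording the triggering rule alongside the verdict is what makes the log usable for forensics: a reviewer can see why a visit was blocked without re-running the policy.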
Structured audit logs also enable role optimization over time. By analyzing which categories each role actually accesses — versus which categories the role is permitted to access — you can identify overly permissive roles and tighten their permissions. If a "Research Agent" is permitted to visit Shopping sites but never actually does, you can remove Shopping from its allow list, further reducing the risk surface.
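The comparison of permitted versus actually used categories is a set difference per role. This sketch assumes audit records are dictionaries with `role`, `verdict`, and `iab_category` keys; the function name and sample data are hypothetical.

```python
from collections import defaultdict


def unused_permissions(role_allow_lists: dict, audit_records: list) -> dict:
    """Map each role to the allowed categories it never visited in the log."""
    used = defaultdict(set)
    for rec in audit_records:
        if rec["verdict"] == "allow":
            used[rec["role"]].add(rec["iab_category"])
    return {
        role: set(allowed) - used[role]
        for role, allowed in role_allow_lists.items()
    }


allow_lists = {"research_agent": {"News", "Technology", "Shopping"}}
log = [
    {"role": "research_agent", "verdict": "allow", "iab_category": "News"},
    {"role": "research_agent", "verdict": "allow", "iab_category": "Technology"},
    {"role": "research_agent", "verdict": "block", "iab_category": "Gambling"},
]
print(unused_permissions(allow_lists, log))
```

Here Shopping shows up as a candidate for removal. In practice the log window should be long enough to cover infrequent but legitimate tasks before a permission is dropped.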
Organizations that operate agents on behalf of multiple clients — managed service providers, SaaS platforms, agent orchestration vendors — need multi-tenant RBAC where each client can define their own role permissions independently. Client A may allow their research agents to visit social media sites while Client B blocks social media entirely. The RBAC system must support per-tenant role definitions while maintaining a global set of safety blocks (Malware, Phishing, Illegal Content) that no tenant can override.
The 102M domain database supports multi-tenant RBAC naturally. Each tenant defines their own role-to-category mappings, and the shared database provides the classification data that all tenants reference. The database is a read-only resource — no tenant can modify the classifications — which ensures consistent policy enforcement across tenants. The per-tenant role definitions are stored separately, typically in the agent orchestration platform's configuration store, and evaluated against the shared database at navigation time.
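A minimal sketch of per-tenant role definitions layered under global safety blocks. The tenant identifiers, the `TENANT_ROLES` structure, and `tenant_verdict` are assumptions; client_b deliberately lists "Malware" in its allow list to show that the global block still wins.

```python
GLOBAL_BLOCKS = frozenset({"Malware", "Phishing", "Illegal Content"})

# Hypothetical per-tenant role definitions, stored apart from the shared database.
TENANT_ROLES = {
    "client_a": {
        "research": {"allow": {"News", "Social Media"}, "block": set()},
    },
    "client_b": {
        "research": {"allow": {"News", "Malware"}, "block": {"Social Media"}},
    },
}


def tenant_verdict(tenant_id: str, role_id: str, category: str) -> str:
    """Global blocks are checked first, so no tenant setting can override them."""
    if category in GLOBAL_BLOCKS:
        return "block"
    role = TENANT_ROLES.get(tenant_id, {}).get(role_id)
    if role is None:
        return "block"  # unknown tenant or role: deny
    if category in role["block"]:
        return "block"
    return "allow" if category in role["allow"] else "block"


print(tenant_verdict("client_a", "research", "Social Media"))
print(tenant_verdict("client_b", "research", "Social Media"))
print(tenant_verdict("client_b", "research", "Malware"))
```

The classification data never appears in the tenant configuration. Each tenant only maps categories to verdicts, which is what lets all tenants share one read-only database.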
Give every agent role precisely the web access it needs — and nothing more. One-time purchase, perpetual license, 102 million domains classified and ready for role-based enforcement.