WebsiteCategorizationAPI

Page-Type Classification That Powers Agent Navigation Policy

Knowing that a domain belongs to "Business and Finance" tells you what the site is about. Knowing that the specific page your agent is about to visit is a login page, a checkout flow, or an admin panel tells you what the agent could do there. Page-type classification adds a critical dimension to agent policy — enabling rules like "allow finance sites but block their login pages" that IAB categories alone cannot express.

20+ Page Types  |  102M Classified Domains  |  700+ IAB Categories  |  Granular Policy Control

The Problem: Category-Only Filtering Is Too Coarse

Blocking an entire category blocks thousands of useful pages along with the few dangerous ones.

Categories Tell You What — Not Where or How

A domain classified as "Business and Finance > Banking" hosts dozens of page types: marketing pages, product descriptions, rate calculators, customer support FAQ, branch locator maps, and — critically — login portals, account dashboards, fund transfer interfaces, and admin panels. Category-level filtering treats all of these pages identically. If you allow the "Banking" category, you allow login pages. If you block it, you lose access to publicly available rate information, branch locations, and financial product comparisons that your agent legitimately needs for research.

  • Login pages: Agents navigating to authentication screens may attempt to log in, triggering security alerts or account lockouts at the target organization
  • Checkout pages: An agent that reaches a payment flow might submit form data, initiating unwanted transactions or exposing internal financial information
  • Admin panels: Administrative interfaces discovered via crawling represent high-value targets — an agent interacting with one is a severe security incident
  • Settings pages: Account settings pages could allow an agent to modify configurations, change passwords, or alter security settings

The Solution: Page-Type Labels Enable Surgical Policy Rules

Page-type classification adds a second axis to your agent policy. Instead of "allow or block this category," you can write rules like "allow Business and Finance domains except login, checkout, and admin page types." This surgical precision means your agent can research banking products, compare interest rates, and read financial news — while being blocked from authentication portals, payment flows, and administrative interfaces on those same domains.

Our database classifies pages into 20+ distinct types: homepage, about, contact, pricing, careers, login, signup, checkout, settings, admin, account, password_reset, legal, privacy_policy, terms_of_service, blog, documentation, api_reference, support, faq, forum, and product pages. Each type maps to a specific risk level and a recommended policy action, giving your policy engine the granularity it needs for production agent deployments.

Page-Type Classification Matrix

20+ distinct page types mapped to policy actions

Page Types Mapped to Agent Policy Actions

Three risk tiers that organize page types into clear policy categories

High Risk: Always Block

Page types that represent interactive surfaces where agent action could cause harm. Login and signup pages involve authentication, so an agent may attempt to enter credentials. Checkout and payment pages involve financial transactions. Admin and settings pages provide control over system configuration. Account dashboards display personal data. Password reset pages could trigger security workflows at the target organization. These page types should be hard-blocked for all agents regardless of category scope.

Medium Risk: Log and Monitor

Page types that are generally safe for reading but may contain sensitive information. Contact pages contain organizational information that could be used for social engineering. Careers pages reveal organizational structure. Legal, privacy policy, and terms of service pages contain binding language. These types are allowed but logged with enhanced detail for audit purposes.

Low Risk: Allow Freely

Page types designed for public consumption and information sharing. Homepage, about, blog, documentation, api_reference, support, faq, forum, product, and pricing pages are built for visitors — including automated ones. These types are allowed with standard logging. They represent the vast majority of pages an agent will encounter during legitimate research tasks.

Risk-Tiered Page Type Sorting

Pages sorted into block, monitor, and allow tiers in real time

Page-Type Policy Integration Code

Implement granular page-type rules in your agent's navigation pipeline

Python — Page-Type Aware Agent Policy Engine

    import http.client
    import json
    from urllib.parse import urlencode


    class PageTypePolicyEngine:
        """Policy engine that combines IAB categories with page-type
        labels for granular agent navigation rules."""

        # Page types grouped by risk tier
        BLOCK_TYPES = {
            "login", "signup", "checkout", "admin",
            "settings", "password_reset", "account",
        }
        MONITOR_TYPES = {
            "contact", "careers", "legal",
            "privacy_policy", "terms_of_service",
        }
        # Listed for reference: any type outside the two sets above,
        # including "unknown", falls through to "allow" in evaluate().
        ALLOW_TYPES = {
            "homepage", "about", "blog", "documentation",
            "api_reference", "support", "faq", "forum",
            "product", "pricing",
        }

        HOST = "www.websitecategorizationapi.com"
        PATH = "/api/iab/iab_web_content_filtering.php"

        def __init__(self, api_key):
            self.api_key = api_key

        def classify(self, url):
            # urlencode escapes "&", "=", and "?" inside the target URL
            payload = urlencode({
                "query": url,
                "api_key": self.api_key,
                "data_type": "url",
                "expanded_categories": "1",
            })
            headers = {"Content-Type": "application/x-www-form-urlencoded"}
            # A fresh connection per call avoids reusing a dropped socket
            conn = http.client.HTTPSConnection(self.HOST, timeout=10)
            try:
                conn.request("POST", self.PATH, payload, headers)
                return json.loads(conn.getresponse().read().decode("utf-8"))
            finally:
                conn.close()

        def evaluate(self, url):
            """Two-dimensional policy evaluation:
            category scope + page type risk tier."""
            data = self.classify(url)
            page_type = data.get("page_type", "unknown")
            categories = [
                c[0].split("Category name: ")[1]
                for c in data.get("iab_classification", [])
            ]

            # Page type takes priority over category
            if page_type in self.BLOCK_TYPES:
                action, tier, label = "block", "high", "High-risk"
            elif page_type in self.MONITOR_TYPES:
                action, tier, label = "allow_monitored", "medium", "Medium-risk"
            else:
                action, tier, label = "allow", "low", "Safe"

            return {
                "action": action,
                "reason": f"{label} page type: {page_type}",
                "page_type": page_type,
                "categories": categories,
                "risk_tier": tier,
            }


    # Usage
    engine = PageTypePolicyEngine(api_key="your_api_key")

    urls = [
        "https://bank.com/personal/savings",
        "https://bank.com/login",
        "https://bank.com/admin/dashboard",
        "https://techblog.com/articles/ai-trends",
    ]

    for url in urls:
        result = engine.evaluate(url)
        print(
            f"[{result['action'].upper()}] {url} "
            f"(type={result['page_type']}, "
            f"risk={result['risk_tier']})"
        )

JavaScript — Page-Type Policy Guard

    class PageTypePolicyGuard {
      constructor(apiKey) {
        this.apiKey = apiKey;
        // "account" is included to match BLOCK_TYPES in the Python engine
        this.blockTypes = new Set([
          "login", "signup", "checkout", "admin",
          "settings", "password_reset", "account"
        ]);
        this.monitorTypes = new Set([
          "contact", "careers", "legal",
          "privacy_policy", "terms_of_service"
        ]);
      }

      async classifyAndEvaluate(targetURL) {
        const resp = await fetch(
          "https://www.websitecategorizationapi.com" +
            "/api/iab/iab_web_content_filtering.php",
          {
            method: "POST",
            headers: {
              "Content-Type": "application/x-www-form-urlencoded"
            },
            body: new URLSearchParams({
              query: targetURL,
              api_key: this.apiKey,
              data_type: "url",
              expanded_categories: "1"
            })
          }
        );
        // Fail loudly instead of parsing an error page as JSON
        if (!resp.ok) {
          throw new Error(`Classification request failed: HTTP ${resp.status}`);
        }

        const data = await resp.json();
        const pageType = data.page_type || "unknown";

        if (this.blockTypes.has(pageType)) {
          return {
            allowed: false,
            action: "block",
            pageType: pageType,
            reason: `High-risk: ${pageType} page`
          };
        }

        const monitored = this.monitorTypes.has(pageType);
        return {
          allowed: true,
          action: monitored ? "allow_monitored" : "allow",
          pageType: pageType,
          reason: monitored
            ? `Monitored: ${pageType} page`
            : `Safe: ${pageType} page`
        };
      }
    }

    // Guard agent navigation with page-type awareness.
    // Top-level await requires an ES module context.
    const guard = new PageTypePolicyGuard("your_key");
    const decision = await guard.classifyAndEvaluate(
      "https://saas-app.com/settings/billing"
    );
    console.log(decision);
    // Example result when the API labels the page as "settings":
    // { allowed: false, action: "block",
    //   pageType: "settings",
    //   reason: "High-risk: settings page" }

Two-Dimensional Policy Evaluation

Category scope + page type risk = precise navigation decisions

AI Agent Database Pricing

Purpose-built domain databases for AI agent filtering. Includes IAB categories, 20+ page types, reputation scores, and popularity rankings. One-time purchase with perpetual license.

AI Agent Database
AI Agent Domain Database 10M
$7,999

10 Million Domains with Page-Type Intelligence

One-time purchase: Perpetual license  |  Optional Updates: $1,599/year

  • 10M+ Categorized Domains
  • IAB Taxonomies v2 & v3
  • 20+ Page Type Labels
  • Web Filtering Categories
  • OpenPageRank Scores
  • Global Popularity Rankings

AI Agent Domain Database 20M (Popular)
$14,999

20 Million Domains with Full Intelligence Suite

One-time purchase: Perpetual license  |  Optional Updates: $2,999/year

  • 20M+ Categorized Domains
  • IAB Taxonomies v2 & v3
  • 20+ Page Type Labels
  • Web Filtering Categories
  • OpenPageRank Scores
  • Global & Country Rankings
  • Dedicated Account Manager

AI Agent Domain Database 50M (Maximum Coverage)
$24,999

50 Million Domains with Complete Intelligence Suite

One-time purchase: Perpetual license  |  Optional Updates: $4,999/year

  • 50M+ Categorized Domains
  • IAB Taxonomies v2 & v3
  • 20+ Page Type Labels
  • Web Filtering Categories
  • OpenPageRank Scores
  • Global & Country Rankings
  • Dedicated Account Manager

Also available: Enterprise URL Database with up to 102M domains, from $2,499.

How Many Domains in Each Category?

Search any IAB or Web Filtering category to see how many domains are in our 102M Enterprise Database — combined with page-type labels, these categories power your agent policy rules.

Database Analytics

Domain Distribution by Category in Our 102M Enterprise Database

How 102 million domains from our main Enterprise Database are distributed across IAB v3 taxonomy classifications

Top 50 IAB v3 Categories

Spanning Tier 1 through Tier 4 classifications from our 102M Enterprise Database


Charts display domain counts for the top 50 of the 700+ categories in our 102M Enterprise Database. To check the number of domains in the remaining 650+ categories, use the Category Counter tool above.

Page-Type Detection Across Domains

Every domain's pages classified into 20+ distinct functional types

Why Page-Type Classification Is the Missing Piece in Agent Governance

Most conversations about AI agent web governance focus on domain categories — blocking adult sites, restricting access to malware domains, limiting agents to specific industry verticals. These category-level controls are necessary but insufficient. The real risk in agent navigation is not the domain; it is the page. An agent visiting bloomberg.com is fine. An agent visiting bloomberg.com/login is a problem. An agent visiting your-internal-app.com/admin is a crisis. Page-type classification bridges this gap by labeling each page with its functional purpose, enabling policies that operate at the page level rather than the domain level.

Consider a practical scenario: your financial research agent needs to access banking websites to compare interest rates, analyze product offerings, and track market trends. Without page-type classification, you have two choices — allow the entire "Banking" category (and accept that the agent might hit login pages) or block it (and lose legitimate research capabilities). With page-type classification, you write a single rule: "allow Banking category, block page types login, checkout, admin, settings." The agent can read publicly available product pages, blog posts, and rate tables while being prevented from interacting with authentication flows, payment systems, and administrative interfaces.

The 20+ Page Types in Our Classification System

Our database classifies pages into the following functional types, each representing a distinct interaction pattern and risk profile.

  • Homepage: the main landing page of a domain, designed for public visitors
  • About: organizational information pages
  • Contact: pages with contact forms, email addresses, and physical addresses
  • Pricing: pages displaying product or service pricing
  • Careers: job listing and recruitment pages
  • Login: authentication pages requiring username and password entry
  • Signup: account registration pages
  • Checkout: payment and transaction pages
  • Settings: account or system configuration pages
  • Admin: administrative control panels
  • Account: user account dashboard pages
  • Password reset: credential recovery pages
  • Legal: legal notice and disclaimer pages
  • Privacy policy: data protection and privacy pages
  • Terms of service: contractual terms pages
  • Blog: article and content pages
  • Documentation: technical documentation pages
  • API reference: API endpoint documentation
  • Support: customer support and help pages
  • FAQ: frequently asked questions
  • Forum: community discussion pages
  • Product: product description and listing pages

Combining Page Types with IAB Categories for Two-Dimensional Policy

The most powerful agent policies combine both dimensions: category scope and page-type restrictions. This two-dimensional model creates a policy matrix where each cell represents a specific combination of "what the site is about" and "what the page does." A financial research agent might have a policy matrix like this:

  • Technology + any page type = allow
  • Finance + blog/documentation/pricing/product = allow
  • Finance + login/checkout/admin = block
  • Healthcare + any page type = block (outside scope)
  • Adult + any page type = block (global rule)

This matrix is easy to define, easy to audit, and easy to explain to compliance reviewers.
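
The matrix described above can be sketched as a small lookup table. This is a minimal illustration, assuming category and page-type labels arrive as plain strings. The names `POLICY_MATRIX`, `GLOBAL_BLOCK_CATEGORIES`, and `decide`, along with the `"*"` wildcard, are hypothetical and not part of the product API. Blocking any combination the matrix does not list is also an assumption of this sketch.

```python
# Hypothetical policy matrix for a financial research agent.
# Keys are (category, page_type); "*" matches any page type.
GLOBAL_BLOCK_CATEGORIES = {"Adult"}

POLICY_MATRIX = {
    ("Technology", "*"): "allow",
    ("Finance", "blog"): "allow",
    ("Finance", "documentation"): "allow",
    ("Finance", "pricing"): "allow",
    ("Finance", "product"): "allow",
    ("Finance", "login"): "block",
    ("Finance", "checkout"): "block",
    ("Finance", "admin"): "block",
    ("Healthcare", "*"): "block",  # outside scope
}


def decide(category: str, page_type: str) -> str:
    """Return "allow" or "block" for one (category, page type) cell."""
    if category in GLOBAL_BLOCK_CATEGORIES:
        return "block"
    # An exact cell wins over the category-wide wildcard.
    for key in ((category, page_type), (category, "*")):
        if key in POLICY_MATRIX:
            return POLICY_MATRIX[key]
    # Assumption: combinations the matrix does not list are blocked.
    return "block"
```

Keeping the matrix as data rather than branching logic means a compliance reviewer can read the table directly without tracing code paths.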

Login Page Detection: The Highest-Priority Use Case

Login pages are the most critical page type to detect and block for AI agents. When an agent encounters a login page, several dangerous scenarios can unfold. If the agent has access to stored credentials (through environment variables, secret managers, or configuration files), it may attempt to authenticate — potentially violating the target service's terms of use, triggering rate limiters, or locking legitimate accounts. Even without credentials, the agent may generate plausible-looking credential pairs from its training data and submit them, creating failed login events that trigger security alerts at the target organization.

Our login page detection identifies authentication pages by analyzing URL patterns, page structure, and form field indicators. The detection covers standard login pages (/login, /signin, /auth), SSO portals, OAuth flows, and custom authentication implementations. When the database labels a page as "login," your agent's middleware can block the navigation before any data is sent to the target server.
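
Blocking "before any data is sent" means the page-type check has to run ahead of the request itself. The sketch below shows one way to arrange that, assuming the agent's middleware receives a lookup function and a fetch function as plain callables. `guarded_fetch`, `NavigationBlocked`, and `BLOCKED_PAGE_TYPES` are hypothetical names used only for illustration.

```python
from typing import Callable

# Assumption: only "login" is blocked here; extend the set as needed.
BLOCKED_PAGE_TYPES = {"login"}


class NavigationBlocked(Exception):
    """Raised when policy forbids visiting a URL."""


def guarded_fetch(url: str,
                  lookup_page_type: Callable[[str], str],
                  fetch: Callable[[str], str]) -> str:
    """Check the page-type label first; call fetch only if permitted."""
    page_type = lookup_page_type(url)
    if page_type in BLOCKED_PAGE_TYPES:
        # fetch is never called, so no request reaches the target server.
        raise NavigationBlocked(f"{page_type} page blocked: {url}")
    return fetch(url)
```

Raising an exception rather than returning an empty page makes the block visible to the agent loop, which can then pick a different URL.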

Checkout Page Protection: Preventing Unwanted Transactions

Checkout pages represent financial risk. An agent that reaches a checkout page might fill form fields with data from its context, potentially initiating purchases, entering credit card numbers, or submitting billing information. Even if the agent does not have access to real payment data, interacting with checkout flows can create partial orders, abandoned carts that trigger marketing emails, or fraud alerts at payment processors. Page-type classification identifies checkout pages (/checkout, /cart, /payment, /billing) and blocks agent navigation before the risk materializes.

Admin Page Protection: The Worst-Case Scenario

Administrative pages are the highest-risk interaction surface on the web. An agent that navigates to an admin panel — whether through a crawled link, a misconfigured redirect, or an adversarial prompt injection — could potentially modify system settings, access sensitive data, create or delete user accounts, or alter security configurations. Even viewing an admin page exposes organizational structure and system architecture information. Our page-type classification detects admin pages (/admin, /dashboard, /panel, /manage, /console) and enables hard-block policies that prevent any agent from reaching administrative interfaces, regardless of the domain's category.
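
The path examples quoted in the login, checkout, and admin sections above can also serve as a cheap local pre-check. The following is a rough sketch that looks at URL path segments only; it is not the database's classifier, which the text says also analyzes page structure and form fields. `PATH_HINTS` and `guess_page_type` are hypothetical names.

```python
from urllib.parse import urlparse

# Path segments taken from the examples in the text above.
PATH_HINTS = {
    "login": {"login", "signin", "auth"},
    "checkout": {"checkout", "cart", "payment", "billing"},
    "admin": {"admin", "dashboard", "panel", "manage", "console"},
}


def guess_page_type(url: str) -> str:
    """Guess a high-risk page type from URL path segments alone."""
    path = urlparse(url).path.lower()
    segments = [s for s in path.split("/") if s]
    # The first matching segment, reading left to right, decides the label.
    for segment in segments:
        for page_type, hints in PATH_HINTS.items():
            if segment in hints:
                return page_type
    return "unknown"
```

Matching whole segments rather than substrings keeps a path such as `/blog/login-tips` from being flagged as a login page. A heuristic like this will still miss custom authentication paths, so it complements a database or API lookup rather than replacing it.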

Real-Time API for Dynamic Page-Type Classification

For pages not covered by the static database — dynamically generated URLs, single-page applications with hash routes, or newly created pages — the real-time API provides on-demand page-type classification. Submit any URL to the API and receive its page-type label within 200 milliseconds. The API analyzes URL structure, path patterns, query parameters, and known page-type indicators to determine the page's functional purpose. Use the API as a fallback for the local database, ensuring that even unknown URLs receive page-type classification before the agent navigates.
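
The local-first, API-second lookup described above might be wired together as follows. This sketch assumes the local database can be represented as a dictionary keyed by exact URL (the real database format is not described here) and that `remote_classify` is a caller-supplied function wrapping the real-time API. Both names, and `classify_with_fallback` itself, are hypothetical.

```python
from typing import Callable, Dict, Optional


def classify_with_fallback(
    url: str,
    local_db: Dict[str, str],
    remote_classify: Callable[[str], Optional[str]],
) -> str:
    """Look up the page type locally, then fall back to a remote call."""
    page_type = local_db.get(url)
    if page_type is not None:
        return page_type
    try:
        page_type = remote_classify(url)
    except Exception:
        # Network errors and timeouts are treated as "no answer".
        page_type = None
    # Cache successful answers so repeated URLs skip the network.
    if page_type is not None:
        local_db[url] = page_type
        return page_type
    return "unknown"
```

The function returns "unknown" when neither source yields a label, leaving it to the policy layer to decide whether unknown pages are allowed or blocked.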

Designing Policies with Page-Type Awareness

Start with a baseline policy that blocks all high-risk page types globally: login, signup, checkout, admin, settings, password_reset, and account. This baseline protects against the most dangerous interactions regardless of domain category. Then layer category-specific rules on top: allow your agent's target categories (Technology, Finance, News) while keeping the page-type blocks in place. Finally, add monitoring rules for medium-risk page types (contact, careers, legal) that log enhanced detail without blocking. This layered approach — global blocks, category scopes, monitoring rules — creates a comprehensive governance framework that is both protective and permissive enough for productive agent operation.
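
The three layers can be expressed as an ordered sequence of checks in which the first matching layer decides. The sketch below assumes the category and page-type labels have already been looked up. `layered_decision`, `GLOBAL_BLOCK`, `MONITORED`, and the `agent_policy` logger name are hypothetical.

```python
import logging
from typing import Set

logger = logging.getLogger("agent_policy")

# Page-type sets taken from the baseline described above.
GLOBAL_BLOCK = {"login", "signup", "checkout", "admin",
                "settings", "password_reset", "account"}
MONITORED = {"contact", "careers", "legal"}


def layered_decision(url: str, category: str, page_type: str,
                     allowed_categories: Set[str]) -> str:
    """Apply the three layers in order and return the first decision."""
    # Layer 1: global page-type blocks apply to every category.
    if page_type in GLOBAL_BLOCK:
        return "block"
    # Layer 2: the agent may only visit its target categories.
    if category not in allowed_categories:
        return "block"
    # Layer 3: medium-risk types pass but leave a detailed audit record.
    if page_type in MONITORED:
        logger.info("monitored visit url=%s category=%s page_type=%s",
                    url, category, page_type)
        return "allow_monitored"
    return "allow"
```

Because the global block runs first, adding a new category to `allowed_categories` can never open access to a login or admin page.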

Policy Matrix Visualization

Category scope x page type risk = precise agent governance

Power Your Agent Policies with Page-Type Intelligence

Deploy page-type classification across 102 million domains. 20+ page types, IAB taxonomy, reputation scores. One-time purchase, perpetual license.
