Page-Type Classification That Powers Agent Navigation Policy

The Problem: Category-Only Filtering Is Too Coarse

Blocking an entire category blocks thousands of useful pages along with the few dangerous ones.

Categories Tell You What — Not Where or How

A domain classified as "Business and Finance > Banking" hosts dozens of page types: marketing pages, product descriptions, rate calculators, customer support FAQ, branch locator maps, and — critically — login portals, account dashboards, fund transfer interfaces, and admin panels. Category-level filtering treats all of these pages identically. If you allow the "Banking" category, you allow login pages. If you block it, you lose access to publicly available rate information, branch locations, and financial product comparisons that your agent legitimately needs for research.

Login pages: Agents navigating to authentication screens may attempt to log in, triggering security alerts or account lockouts at the target organization
Checkout pages: An agent that reaches a payment flow might submit form data, initiating unwanted transactions or exposing internal financial information
Admin panels: Administrative interfaces discovered via crawling represent high-value targets — an agent interacting with one is a severe security incident
Settings pages: Account settings pages could allow an agent to modify configurations, change passwords, or alter security settings

The Solution: Page-Type Labels Enable Surgical Policy Rules

Page-type classification adds a second axis to your agent policy. Instead of "allow or block this category," you can write rules like "allow Business and Finance domains except login, checkout, and admin page types." This surgical precision means your agent can research banking products, compare interest rates, and read financial news — while being blocked from authentication portals, payment flows, and administrative interfaces on those same domains.

Our database classifies pages into 20+ distinct types: homepage, about, contact, pricing, careers, login, signup, checkout, settings, admin, account, password_reset, legal, privacy_policy, terms_of_service, blog, documentation, api_reference, support, faq, forum, and product pages. Each type maps to a specific risk level and a recommended policy action, giving your policy engine the granularity it needs for production agent deployments.

Page Types Mapped to Agent Policy Actions

Three risk tiers that organize page types into clear policy categories

High Risk: Always Block

Page types that represent interactive surfaces where agent action could cause harm. Login and signup pages involve authentication, so an agent may attempt to enter credentials. Checkout and payment pages involve financial transactions. Admin and settings pages provide control over system configuration. Account pages are user dashboards that display personal data. Password reset pages could trigger security workflows at the target organization. These page types should be hard-blocked for all agents regardless of category scope.

Medium Risk: Log and Monitor

Page types that are generally safe for reading but may contain sensitive information. Contact pages contain organizational information that could be used for social engineering. Careers pages reveal organizational structure. Legal, privacy policy, and terms of service pages contain binding language. These types are allowed but logged with enhanced detail for audit purposes.

Low Risk: Allow Freely

Page types designed for public consumption and information sharing. Homepage, about, blog, documentation, api_reference, support, faq, forum, product, and pricing pages are built for visitors — including automated ones. These types are allowed with standard logging. They represent the vast majority of pages an agent will encounter during legitimate research tasks.

Page-Type Policy Integration Code

Implement granular page-type rules in your agent's navigation pipeline

Python — Page-Type Aware Agent Policy Engine

import http.client
import json
from urllib.parse import urlencode

class PageTypePolicyEngine:
    """Policy engine that combines IAB categories with
    page-type labels for granular agent navigation rules."""

    # Page types grouped by risk tier
    BLOCK_TYPES = {
        "login", "signup", "checkout", "admin",
        "settings", "password_reset", "account"
    }
    MONITOR_TYPES = {
        "contact", "careers", "legal",
        "privacy_policy", "terms_of_service"
    }
    ALLOW_TYPES = {
        "homepage", "about", "blog", "documentation",
        "api_reference", "support", "faq", "forum",
        "product", "pricing"
    }

    HOST = "www.websitecategorizationapi.com"

    def __init__(self, api_key):
        self.api_key = api_key

    def classify(self, url):
        # URL-encode the form body so "&" or "=" inside the
        # target URL cannot corrupt the request parameters
        payload = urlencode({
            "query": url,
            "api_key": self.api_key,
            "data_type": "url",
            "expanded_categories": "1",
        })
        headers = {
            "Content-Type": "application/x-www-form-urlencoded"
        }
        # Open a fresh connection per call; a long-lived
        # connection may be closed by the server between calls
        conn = http.client.HTTPSConnection(self.HOST, timeout=10)
        try:
            conn.request(
                "POST",
                "/api/iab/iab_web_content_filtering.php",
                payload, headers
            )
            body = conn.getresponse().read().decode("utf-8")
        finally:
            conn.close()
        return json.loads(body)

    def evaluate(self, url):
        """Two-dimensional policy evaluation:
        category scope + page type risk tier."""
        data = self.classify(url)
        page_type = data.get("page_type") or "unknown"
        # [-1] keeps the raw string if the prefix is absent
        categories = [
            c[0].split("Category name: ")[-1]
            for c in data.get("iab_classification", [])
        ]

        # Page-type takes priority over category
        if page_type in self.BLOCK_TYPES:
            action, tier = "block", "high"
            reason = f"High-risk page type: {page_type}"
        elif page_type in self.MONITOR_TYPES:
            action, tier = "allow_monitored", "medium"
            reason = f"Medium-risk page type: {page_type}"
        elif page_type in self.ALLOW_TYPES:
            action, tier = "allow", "low"
            reason = f"Safe page type: {page_type}"
        else:
            # A missing or unlisted label is not known to be safe;
            # monitor it here, or block it for a stricter policy
            action, tier = "allow_monitored", "unknown"
            reason = f"Unrecognized page type: {page_type}"

        return {
            "action": action,
            "reason": reason,
            "page_type": page_type,
            "categories": categories,
            "risk_tier": tier
        }

# Usage
engine = PageTypePolicyEngine(api_key="your_api_key")

urls = [
    "https://bank.com/personal/savings",
    "https://bank.com/login",
    "https://bank.com/admin/dashboard",
    "https://techblog.com/articles/ai-trends",
]

for url in urls:
    result = engine.evaluate(url)
    print(
        f"[{result['action'].upper()}] {url} "
        f"(type={result['page_type']}, "
        f"risk={result['risk_tier']})"
    )

JavaScript — Page-Type Policy Guard

class PageTypePolicyGuard {
  constructor(apiKey) {
    this.apiKey = apiKey;
    this.blockTypes = new Set([
      "login", "signup", "checkout", "admin",
      "settings", "password_reset", "account"
    ]);
    this.monitorTypes = new Set([
      "contact", "careers", "legal",
      "privacy_policy", "terms_of_service"
    ]);
  }

  async classifyAndEvaluate(targetURL) {
    const resp = await fetch(
      "https://www.websitecategorizationapi.com" +
      "/api/iab/iab_web_content_filtering.php",
      {
        method: "POST",
        headers: {
          "Content-Type":
            "application/x-www-form-urlencoded"
        },
        body: new URLSearchParams({
          query: targetURL,
          api_key: this.apiKey,
          data_type: "url",
          expanded_categories: "1"
        })
      }
    );
    // Fail loudly rather than parsing an error response
    if (!resp.ok) {
      throw new Error(
        `Classification failed: HTTP ${resp.status}`
      );
    }
    const data = await resp.json();
    const pageType = data.page_type || "unknown";

    if (this.blockTypes.has(pageType)) {
      return {
        allowed: false,
        action: "block",
        pageType: pageType,
        reason: `High-risk: ${pageType} page`
      };
    }

    // Monitor medium-risk types and pages with no label
    const monitored =
      this.monitorTypes.has(pageType) ||
      pageType === "unknown";
    return {
      allowed: true,
      action: monitored
        ? "allow_monitored" : "allow",
      pageType: pageType,
      reason: monitored
        ? `Monitored: ${pageType} page`
        : `Safe: ${pageType} page`
    };
  }
}

// Guard agent navigation with page-type awareness
const guard = new PageTypePolicyGuard("your_key");
const decision = await guard.classifyAndEvaluate(
  "https://saas-app.com/settings/billing"
);
console.log(decision);
// { allowed: false, action: "block",
//   pageType: "settings",
//   reason: "High-risk: settings page" }

Pre-Classified Page-Type URLs

Why Pre-Classified URLs for 102M Domains Changes Everything for AI Agents

Having pre-classified URLs for 20+ page types across 102 million domains at the start of any agent task means your agents can skip the discovery phase. The result: task completion that can be orders of magnitude faster.

Orders of Magnitude Faster

Without pre-classified data, an agent must crawl each domain, follow links, load pages, and analyze content to find a login or pricing page. That takes seconds to minutes per domain. With our database, the agent gets the exact URL in under 1ms — a local lookup instead of a live crawl.

From minutes per domain to microseconds
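
As a rough illustration of the local-lookup idea, the sketch below loads a few rows into an in-memory SQLite table and retrieves the URL for a given domain and page type. The table name `page_urls`, its columns, the one-URL-per-type constraint, and the sample rows are all assumptions made for this example; the actual database schema is not described here.

```python
import sqlite3

# Hypothetical schema: one row per (domain, page_type) with the classified URL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_urls ("
    " domain TEXT NOT NULL,"
    " page_type TEXT NOT NULL,"
    " url TEXT NOT NULL,"
    " PRIMARY KEY (domain, page_type))"
)
# Illustrative sample rows, not real database contents.
conn.executemany(
    "INSERT INTO page_urls VALUES (?, ?, ?)",
    [
        ("example.com", "pricing", "https://example.com/plans"),
        ("example.com", "login", "https://example.com/signin"),
    ],
)

def lookup_url(domain, page_type):
    """Return the pre-classified URL, or None if the pair is not in the table."""
    row = conn.execute(
        "SELECT url FROM page_urls WHERE domain = ? AND page_type = ?",
        (domain, page_type),
    ).fetchone()
    return row[0] if row else None

print(lookup_url("example.com", "pricing"))
print(lookup_url("example.com", "careers"))
```

The primary key on (domain, page_type) lets the lookup use an index instead of scanning the table, which is what keeps a local query cheap compared with a crawl.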

Dramatically Lower Cost

Live crawling and AI classification at runtime burns tokens, compute, and API calls. Every page an agent visits to discover structure costs $0.01–$0.05 in LLM inference. Multiply by thousands of domains and the bill explodes. A one-time database purchase eliminates all per-query classification costs.

One-time cost vs. per-query billing
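
To make the arithmetic concrete, the sketch below multiplies the per-page cost range quoted above by a number of pages and domains. The workload of 20 pages per domain across 5,000 domains is an assumption picked only for illustration.

```python
def discovery_cost(domains, pages_per_domain, cost_per_page):
    """Total LLM inference cost of visiting pages to discover site structure."""
    return domains * pages_per_domain * cost_per_page

# Assumed workload (illustrative only).
DOMAINS = 5_000
PAGES_PER_DOMAIN = 20

# Per-page cost range taken from the text: $0.01 to $0.05.
low = discovery_cost(DOMAINS, PAGES_PER_DOMAIN, 0.01)
high = discovery_cost(DOMAINS, PAGES_PER_DOMAIN, 0.05)
print(f"Estimated discovery cost: ${low:,.0f} to ${high:,.0f}")
```

Under these assumptions the agent visits 100,000 pages, so the estimate ranges from $1,000 to $5,000 per full discovery pass.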

Zero Hallucination Risk

When agents guess URLs, they can hallucinate. An LLM asked to find a company's pricing page might fabricate /pricing, /plans, or /packages, none of which may exist on that site. Our database provides verified, real URLs that were actually discovered and classified, so the agent does not have to guess.

Verified URLs, not AI guesses

1000x faster lookups

Zero per-query cost

100% verified URLs

AI Agent Database Pricing

Purpose-built domain databases for AI agent filtering. Includes IAB categories, 20+ page types, reputation scores, and popularity rankings. One-time purchase with perpetual license.

AI Agent Database

AI Agent Domain Database 10M

$7,999

10 Million Domains with Page-Type Intelligence

One-time purchase: Perpetual license | Optional Updates: $1,599/year

10M+ Categorized Domains
IAB Taxonomies v2 & v3
20+ Page Type Labels
Web Filtering Categories
OpenPageRank Scores
Global Popularity Rankings


Popular

AI Agent Domain Database 20M

$14,999

20 Million Domains with Full Intelligence Suite

One-time purchase: Perpetual license | Optional Updates: $2,999/year

20M+ Categorized Domains
IAB Taxonomies v2 & v3
20+ Page Type Labels
Web Filtering Categories
OpenPageRank Scores
Global & Country Rankings
Dedicated Account Manager


Maximum Coverage

AI Agent Domain Database 50M

$24,999

50 Million Domains with Complete Intelligence Suite

One-time purchase: Perpetual license | Optional Updates: $4,999/year

50M+ Categorized Domains
IAB Taxonomies v2 & v3
20+ Page Type Labels
Web Filtering Categories
OpenPageRank Scores
Global & Country Rankings
Dedicated Account Manager


Also available: Enterprise URL Database up to 102M domains from $2,499.

Why Page-Type Classification Is the Missing Piece in Agent Governance

Most conversations about AI agent web governance focus on domain categories — blocking adult sites, restricting access to malware domains, limiting agents to specific industry verticals. These category-level controls are necessary but insufficient. The real risk in agent navigation is not the domain; it is the page. An agent visiting bloomberg.com is fine. An agent visiting bloomberg.com/login is a problem. An agent visiting your-internal-app.com/admin is a crisis. Page-type classification bridges this gap by labeling each page with its functional purpose, enabling policies that operate at the page level rather than the domain level.

Consider a practical scenario: your financial research agent needs to access banking websites to compare interest rates, analyze product offerings, and track market trends. Without page-type classification, you have two choices — allow the entire "Banking" category (and accept that the agent might hit login pages) or block it (and lose legitimate research capabilities). With page-type classification, you write a single rule: "allow Banking category, block page types login, checkout, admin, settings." The agent can read publicly available product pages, blog posts, and rate tables while being prevented from interacting with authentication flows, payment systems, and administrative interfaces.

The 20+ Page Types in Our Classification System

Our database classifies pages into the following functional types, each representing a distinct interaction pattern and risk profile.

Homepage: the main landing page of a domain, designed for public visitors
About: organizational information pages
Contact: pages with contact forms, email addresses, and physical addresses
Pricing: pages displaying product or service pricing
Careers: job listing and recruitment pages
Login: authentication pages requiring username and password entry
Signup: account registration pages
Checkout: payment and transaction pages
Settings: account or system configuration pages
Admin: administrative control panels
Account: user account dashboard pages
Password reset: credential recovery pages
Legal: legal notice and disclaimer pages
Privacy policy: data protection and privacy pages
Terms of service: contractual terms pages
Blog: article and content pages
Documentation: technical documentation pages
API reference: API endpoint documentation
Support: customer support and help pages
FAQ: frequently asked questions
Forum: community discussion pages
Product: product description and listing pages

Combining Page Types with IAB Categories for Two-Dimensional Policy

The most powerful agent policies combine both dimensions: category scope and page-type restrictions. This two-dimensional model creates a policy matrix where each cell represents a specific combination of "what the site is about" and "what the page does." A financial research agent might have a policy matrix like this:

Technology + any page type = allow
Finance + blog/documentation/pricing/product = allow
Finance + login/checkout/admin = block
Healthcare + any page type = block (outside scope)
Adult + any page type = block (global rule)

This matrix is easy to define, easy to audit, and easy to explain to compliance reviewers.
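
One way to express such a matrix in code is a lookup keyed by category, with a wildcard standing for "any page type". The sketch below is a minimal illustration that assumes the category and page-type labels for a URL are already known. The names `POLICY_MATRIX` and `decide`, and the choice to block any combination the matrix does not list, are assumptions for this example.

```python
ANY = "*"  # wildcard meaning "any page type"

# Rows mirror the example matrix in the text: category -> {page type -> action}.
POLICY_MATRIX = {
    "Technology": {ANY: "allow"},
    "Finance": {
        "blog": "allow", "documentation": "allow",
        "pricing": "allow", "product": "allow",
        "login": "block", "checkout": "block", "admin": "block",
    },
    "Healthcare": {ANY: "block"},  # outside scope
    "Adult": {ANY: "block"},       # global rule
}

def decide(category, page_type, default="block"):
    """Look up the action for one (category, page_type) cell."""
    row = POLICY_MATRIX.get(category)
    if row is None:
        return default  # category not in the matrix at all
    if page_type in row:
        return row[page_type]  # an explicit cell wins over the wildcard
    return row.get(ANY, default)

print(decide("Finance", "pricing"))
print(decide("Finance", "login"))
print(decide("Healthcare", "blog"))
```

Keeping the matrix as plain data rather than branching code means a compliance reviewer can audit it without reading the evaluation logic.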

Login Page Detection: The Highest-Priority Use Case

Login pages are the most critical page type to detect and block for AI agents. When an agent encounters a login page, several dangerous scenarios can unfold. If the agent has access to stored credentials (through environment variables, secret managers, or configuration files), it may attempt to authenticate — potentially violating the target service's terms of use, triggering rate limiters, or locking legitimate accounts. Even without credentials, the agent may generate plausible-looking credential pairs from its training data and submit them, creating failed login events that trigger security alerts at the target organization.

Our login page detection identifies authentication pages by analyzing URL patterns, page structure, and form field indicators. The detection covers standard login pages (/login, /signin, /auth), SSO portals, OAuth flows, and custom authentication implementations. When the database labels a page as "login," your agent's middleware can block the navigation before any data is sent to the target server.
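
A minimal sketch of such middleware follows. It assumes a hypothetical `get_page_type` callable that returns the label for a URL (from the database or elsewhere) and a hypothetical `fetch` callable that performs the actual request; neither is part of any interface documented here.

```python
class NavigationBlocked(Exception):
    """Raised when policy forbids visiting a URL."""

def guarded_navigate(url, get_page_type, fetch,
                     blocked_types=frozenset({"login"})):
    """Check the page-type label first; call fetch() only if permitted."""
    page_type = get_page_type(url)
    if page_type in blocked_types:
        # No request is issued, so nothing reaches the target server.
        raise NavigationBlocked(f"{page_type} page blocked: {url}")
    return fetch(url)

# Usage with stand-in callables (illustrative only).
labels = {"https://example.com/signin": "login"}
requests_sent = []

def fake_fetch(url):
    requests_sent.append(url)
    return f"<html>content of {url}</html>"

try:
    guarded_navigate("https://example.com/signin", labels.get, fake_fetch)
except NavigationBlocked as exc:
    print(exc)

print(guarded_navigate("https://example.com/blog", labels.get, fake_fetch))
```

Because the check runs before `fetch`, the blocked URL never appears in `requests_sent`; only the blog page is requested.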

Checkout Page Protection: Preventing Unwanted Transactions

Checkout pages represent financial risk. An agent that reaches a checkout page might fill form fields with data from its context, potentially initiating purchases, entering credit card numbers, or submitting billing information. Even if the agent does not have access to real payment data, interacting with checkout flows can create partial orders, abandoned carts that trigger marketing emails, or fraud alerts at payment processors. Page-type classification identifies checkout pages (/checkout, /cart, /payment, /billing) and blocks agent navigation before the risk materializes.

Admin Page Protection: The Worst-Case Scenario

Administrative pages are the highest-risk interaction surface on the web. An agent that navigates to an admin panel — whether through a crawled link, a misconfigured redirect, or an adversarial prompt injection — could potentially modify system settings, access sensitive data, create or delete user accounts, or alter security configurations. Even viewing an admin page exposes organizational structure and system architecture information. Our page-type classification detects admin pages (/admin, /dashboard, /panel, /manage, /console) and enables hard-block policies that prevent any agent from reaching administrative interfaces, regardless of the domain's category.
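
The URL paths listed in the sections above suggest a simple first-pass heuristic. The sketch below matches the first path segment against those lists. It is a rough illustration, not the detection method used by the database, which the text says also analyzes page structure and form fields. The function name and the `PATH_HINTS` table are assumptions.

```python
from urllib.parse import urlparse

# Path hints taken from the examples in the text.
PATH_HINTS = {
    "login": {"login", "signin", "auth"},
    "checkout": {"checkout", "cart", "payment", "billing"},
    "admin": {"admin", "dashboard", "panel", "manage", "console"},
}

def guess_page_type(url):
    """Guess a high-risk page type from the first path segment, else 'unknown'."""
    segments = [s for s in urlparse(url).path.lower().split("/") if s]
    if not segments:
        return "unknown"
    for page_type, hints in PATH_HINTS.items():
        if segments[0] in hints:
            return page_type
    return "unknown"

print(guess_page_type("https://shop.example.com/cart"))
print(guess_page_type("https://example.com/blog/login-tips"))
```

Matching whole segments rather than substrings avoids flagging a blog post whose slug merely contains the word "login", at the cost of missing nested paths such as /account/login.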

Real-Time API for Dynamic Page-Type Classification

For pages not covered by the static database — dynamically generated URLs, single-page applications with hash routes, or newly created pages — the real-time API provides on-demand page-type classification. Submit any URL to the API and receive its page-type label within 200 milliseconds. The API analyzes URL structure, path patterns, query parameters, and known page-type indicators to determine the page's functional purpose. Use the API as a fallback for the local database, ensuring that even unknown URLs receive page-type classification before the agent navigates.
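
The fallback pattern can be sketched as a two-step lookup: consult the local data first and call the API only on a miss. In the sketch below, `local_labels` stands in for the static database and `classify_remote` for a call to the real-time API. Both names, the caching of remote results, and the "unknown" label returned on failure are assumptions for illustration.

```python
def make_classifier(local_labels, classify_remote):
    """Return a function that prefers local labels and falls back to a remote call."""
    cache = dict(local_labels)  # copy so the caller's mapping is not mutated

    def classify(url):
        if url in cache:
            return cache[url], "local"
        try:
            label = classify_remote(url)
        except Exception:
            # On API failure, report "unknown" so policy can treat it cautiously.
            return "unknown", "error"
        cache[url] = label  # remember the answer for later lookups
        return label, "api"

    return classify

# Usage with a stand-in for the remote API (illustrative only).
calls = []

def fake_remote(url):
    calls.append(url)
    return "pricing"

classify = make_classifier({"https://example.com/login": "login"}, fake_remote)
print(classify("https://example.com/login"))
print(classify("https://example.com/#/plans"))
print(classify("https://example.com/#/plans"))
```

The second lookup of the same hash-route URL is served from the cache, so the stand-in API is called only once.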


Designing Policies with Page-Type Awareness

Start with a baseline policy that blocks all high-risk page types globally: login, signup, checkout, admin, settings, password_reset, and account. This baseline protects against the most dangerous interactions regardless of domain category. Then layer category-specific rules on top: allow your agent's target categories (Technology, Finance, News) while keeping the page-type blocks in place. Finally, add monitoring rules for medium-risk page types (contact, careers, legal) that log enhanced detail without blocking. This layered approach — global blocks, category scopes, monitoring rules — creates a comprehensive governance framework that is both protective and permissive enough for productive agent operation.
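
The three layers can be sketched as an ordered sequence of checks. This is a minimal illustration; the function name, the `allowed_categories` parameter, and the action strings are assumptions chosen to match the layers described above.

```python
GLOBAL_BLOCK = {"login", "signup", "checkout", "admin",
                "settings", "password_reset", "account"}
MONITOR = {"contact", "careers", "legal"}

def layered_decision(category, page_type, allowed_categories):
    """Apply global blocks, then category scope, then monitoring rules."""
    # Layer 1: high-risk page types are blocked on every domain.
    if page_type in GLOBAL_BLOCK:
        return "block"
    # Layer 2: only categories in the agent's scope are reachable.
    if category not in allowed_categories:
        return "block"
    # Layer 3: medium-risk page types are allowed with enhanced logging.
    if page_type in MONITOR:
        return "allow_monitored"
    return "allow"

scope = {"Technology", "Finance", "News"}
print(layered_decision("Finance", "pricing", scope))
print(layered_decision("Finance", "login", scope))
print(layered_decision("News", "careers", scope))
print(layered_decision("Healthcare", "blog", scope))
```

The order of the checks matters: because the global block runs first, adding a category to the scope can never expose a login or admin page.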

