Introduction to Website Categorization API and our other API solutions
1. Website categorization API
Our Website Categorization API provides accurate URL/webpage classification based on the widely trusted IAB Taxonomy. It also returns categories from a Website Filtering Taxonomy and relevant User Personas (drawn from around 2,000 User Personas that may be interested in the webpage). All categorizations are returned with their confidence scores.
Categorizations are done in real time, using full-path URLs. In addition to URLs, you can also use our API to classify plain text.
We return categorizations of URLs for the following taxonomies:
- IAB, version 3, with 4 Tiers (taxonomy from the Interactive Advertising Bureau - IAB, with members including top global companies - Ford, IBM, Unilever, etc.) (703 categories)
- IAB, version 2, with 4 Tiers (taxonomy from the Interactive Advertising Bureau - IAB, with members including top global companies - Procter & Gamble, General Motors, etc.) (698 categories)
- IPTC NewsCodes, especially suitable for News Categorization (used by the world's largest news agencies, such as Associated Press, Reuters, BBC, New York Times) (1124 categories)
- Web Content Filtering Taxonomy (44 categories)
- Google Shopping Taxonomy, used by millions of online retailers (5474 categories)
- Shopify Taxonomy, used by millions of online stores (10560 categories)
- Amazon Taxonomy (39004 categories)
We also return the following classifications / enriched data about each URL:
- Detection of Malware/Social Engineering/Malicious Software
- Web Technologies used
- Likely Buyer Personas
- Topics
- Key Named Entities
- Sentiment Analysis
- Similar companies / Competitors
- Similar domains
- Tags
- Keywords
For low-latency use cases we also offer an offline categorization database with 30 million domains already classified.
It covers 99% of active internet usage, with 18 million domains sourced from the Google Chrome UX Report.
2. Web Technology Lookup API
For each URL we return all the web technologies used by the website among the 4000+ web technologies that we track.
Authentication
Send your API requests to: https://www.websitecategorizationapi.com/api/
You must have a valid API key, available by purchasing a subscription. After obtaining your plan, log in to retrieve the key.
The API key should be included in all requests as a parameter, for example:
api_key: b4dade2ce5fb2d0b189b5eb6f0cd
Successful requests return 200. Results are in JSON by default.
Rules and Limits
Credit Monitoring
The API response includes both total_credits and remaining_credits. Monitor your remaining credits. If you run out, you can purchase more or upgrade your plan.
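A minimal Python sketch of how the credit fields might be monitored. The helper name `credits_low`, the 10% threshold, and the sample values are hypothetical; only the field names total_credits and remaining_credits come from the description above, and the sketch accepts them as either numbers or numeric strings because their exact type is not specified here.

```python
import json


def credits_low(response: dict, threshold: float = 0.1) -> bool:
    """Return True when remaining credits drop below `threshold` of the total."""
    total = float(response["total_credits"])
    remaining = float(response["remaining_credits"])
    if total <= 0:
        # No usable quota information: treat as low so the caller investigates.
        return True
    return remaining / total < threshold


# Hypothetical response fragment, for illustration only.
sample = json.loads('{"status": 200, "total_credits": 1000, "remaining_credits": 50}')

if credits_low(sample):
    print("Remaining credits are low; consider purchasing more or upgrading.")
```

Calling a check like this after each request gives an early warning before the quota is used up.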
Flexible Scaling
Easily upgrade your plan or purchase additional credits through our pricing page when needed.
Website Categorization of URLs
The default API classifies URLs according to the following major taxonomies:
- IAB (v3) returned under iab_taxonomy
- IAB (v2) returned under iab_taxonomy_version2
- Web Filtering returned under filtering_taxonomy
Example request (cURL POST):
curl -X POST -H 'Content-Type: application/x-www-form-urlencoded' \
-d 'query=www.apple.com&api_key=your_api_key&data_type=url&confidence=1&expanded_categories=1' \
'https://www.websitecategorizationapi.com/api/iab/iab_web_content_filtering.php'
Query Parameters
| Parameter | Type | Description |
|---|---|---|
| query | string | The URL or text to be categorized. |
| api_key | string | Your API key. |
| data_type | string | "url" or "text". |
| confidence | string | Set to 1 to obtain confidence scores for User Personas. |
| expanded_categories | string | Set to 1 to obtain enrichment data and malware detection. Set to "malware_social_engineering_malicious_software_detection" to obtain only malware detection, without enriched categories. |
| use_domain_as_basis_of_categorization_for_insufficient_subdomain_content | string | Set to 1 to fall back to the root domain when the URL you pass is a subdomain (e.g. api.deepl.com) that has no content or insufficient content (e.g. a 404 error). The API then categorizes the subdomain based on the content of the corresponding root domain, in this case deepl.com. |
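The parameters in the table can be assembled into a form-encoded request body along these lines. This is a sketch only: the function name `build_body` and its keyword arguments are hypothetical conveniences, while the parameter names and values are taken from the table.

```python
from urllib.parse import urlencode

ENDPOINT = "https://www.websitecategorizationapi.com/api/iab/iab_web_content_filtering.php"


def build_body(query, api_key, data_type="url", confidence=False,
               expanded_categories=None, domain_fallback=False):
    """Build the application/x-www-form-urlencoded body for a categorization call."""
    if data_type not in ("url", "text"):
        raise ValueError('data_type must be "url" or "text"')
    params = {"query": query, "api_key": api_key, "data_type": data_type}
    if confidence:
        params["confidence"] = "1"
    if expanded_categories is not None:
        # "1" for full enrichment, or the malware-only value from the table.
        params["expanded_categories"] = expanded_categories
    if domain_fallback:
        params["use_domain_as_basis_of_categorization_for_insufficient_subdomain_content"] = "1"
    return urlencode(params)


body = build_body("api.deepl.com", "your_api_key", confidence=True,
                  expanded_categories="1", domain_fallback=True)
print(body)
```

The resulting string can be sent as the POST body to `ENDPOINT` with the Content-Type header shown in the cURL example.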
Example response:
{
"iab_classification": [
[
"Category name: Technology & Computing > Computing > Computer Software and Applications > Operating Systems",
"Confidence: 1.0"
],
[
"Category name: Technology & Computing > Consumer Electronics",
"Confidence: 0.5414350628852844"
],
[
"Category name: Technology & Computing > Consumer Electronics > Smartphones",
"Confidence: 0.313300222158432"
],
[
"Category name: Technology & Computing > Computing",
"Confidence: 0.19400359690189362"
],
[
"Category name: Business and Finance > Business",
"Confidence: 0.19046887755393982"
],
[
"Category name: Technology & Computing > Computing > Laptops",
"Confidence: 0.13031959533691406"
],
[
"Category name: Technology & Computing > Computing > Computer Peripherals",
"Confidence: 0.10443061590194702"
]
],
"filtering_taxonomy": [
[
"Category name: Computers & Technology",
"Confidence: 1.0"
]
],
"status": 200,
"buyer_personas_confidence_selection": {
"Smartphone Enthusiast": 1,
"Tech Enthusiast": 0.9,
"Gadget Enthusiast": 0.9,
"Technology Enthusiast": 0.9,
"Consumer Advocate": 0.7,
"Software Developer": 0.6,
"IT Professional": 0.6,
"Digital Marketer": 0.6,
"Business Owner": 0.6,
"E-commerce Entrepreneur": 0.6,
"Cloud Computing Specialist": 0.5,
"Mobile App Developer": 0.5,
"AI Technologist": 0.5,
"Web Developer": 0.5,
"Startup Enthusiast": 0.5,
"Application Developer": 0.5,
"Hardware Engineer": 0.5,
"Cybersecurity Expert": 0.4,
"IoT Enthusiast": 0.4,
"Data Scientist": 0.4,
"DevOps Engineer": 0.4,
"Information Systems Manager": 0.4,
"Consulting Enthusiast": 0.4,
"Backend Developer": 0.4,
"Data Analyst": 0.4,
"Full Stack Developer": 0.4,
"Network Administrator": 0.4,
"Database Administrator": 0.3,
"Financial Analyst": 0.3,
"Financial Advisor": 0.3,
"Investor": 0.3,
"Economic Analyst": 0.3,
"Portfolio Manager": 0.3
},
"technologies_website": {
"https://www.apple.com/": {
"status": 200
}
},
"data": {
"Topics": [
[
"MacBook Air performance",
"Highlighted sky high performance with M4 chip"
],
[
"Education savings",
"Promotion for buying Mac or iPad for college"
],
[
"iPhone 16 family",
"Introduction and shop link for iPhone 16 models"
],
[
"Apple Intelligence",
"Repeated \"built for apple intelligence\" branding"
],
[
"Trade-in program",
"$170–$630 credit when trading in eligible iPhone"
],
[
"Apple Card benefits",
"Up to 3% daily cash back"
],
[
"Entertainment services",
"Apple TV+, Apple Music, Podcasts etc"
],
[
"Accessibility features",
"AirPods Pro 2 hearing aid and protection"
]
],
"Key Named Entities": [
[
"Apple Inc.",
"Organization"
],
[
"MacBook Air",
"Product"
],
[
"iPhone 16",
"Product"
],
[
"AirPods Pro 2",
"Product"
],
[
"Goldman Sachs Bank USA",
"Organization"
],
[
"MLB Advanced Media, L.P.",
"Organization"
],
[
"FDA",
"Organization"
]
],
"Likely Buyer Personas": [
[
"College Students",
"Education discount offers for Mac and iPad"
],
[
"Tech Enthusiasts",
"Interest in latest hardware and Apple Intelligence"
],
[
"Audio Users",
"Hearing aid and protection features in AirPods Pro 2"
],
[
"Creative Professionals",
"MacBook Pro performance for work"
],
[
"Entertainment Consumers",
"Subscriptions to Apple TV+ and Apple Music"
]
],
"Related Keywords": [
"macbook air",
"iphone 16",
"airpods pro 2",
"apple intelligence",
"trade-in",
"apple card",
"apple tv+",
"apple music",
"hearing aid feature",
"education savings"
],
"Similar companies": [
[
"Samsung",
"Competes in smartphones and tablets"
],
[
"Google",
"Competes with Pixel devices and services"
],
[
"Microsoft",
"Competes with Surface devices and Windows ecosystem"
],
[
"Bose",
"Competes in audio headset market"
],
[
"Netflix",
"Competes with Apple TV+ streaming service"
]
],
"Tags": [
"Apple",
"MacBook",
"iPhone",
"iPad",
"AirPods",
"Apple Intelligence",
"Trade-In",
"Apple Card",
"Entertainment",
"Accessibility"
],
"Social media profiles": [
"No social media profiles present in page content"
],
"Likely Audience Demographics": [
[
"Ages 18–24",
"Targeted by education promotions and hearing features"
],
[
"Ages 25–45",
"Professionals and tech enthusiasts"
],
[
"Students",
"Education savings on devices"
],
[
"Hearing-impaired adults",
"AirPods Pro 2 hearing aid feature"
],
[
"Entertainment fans",
"Apple TV+ and Music services"
]
],
"Sentiment Analysis": [
[
"Apple Inc.",
"Neutral"
],
[
"MacBook Air",
"Positive"
],
[
"iPhone 16",
"Positive"
],
[
"AirPods Pro 2",
"Positive"
],
[
"Apple Card",
"Neutral"
]
],
"Language": [
"English"
],
"Legal Entity & Address": [
[
"Apple Inc.",
"Legal entity name found in footer; no address present"
]
]
},
"malware": "No detection of Malware/Social Engineering/Malicious_Software",
"technologies": [
{
"slug": "cart-functionality",
"name": "Cart Functionality",
"description": "Websites that have a shopping cart or checkout page, either using a known ecommerce platform or a custom solution.",
"confidence": 100,
"version": null,
"icon": "Cart-generic.svg",
"website": "",
"cpe": null,
"categories": [
{
"id": 6,
"slug": "ecommerce",
"name": "Ecommerce"
}
],
"rootPath": true
},
{
"slug": "apple-mapkit-js",
"name": "Apple MapKit JS",
"description": "Apple MapKit JS lets you embed interactive maps directly into your websites across platforms and operating systems, including iOS and Android.",
"confidence": 100,
"version": null,
"icon": "Apple.svg",
"website": "https://developer.apple.com/maps/web/",
"cpe": null,
"categories": [
{
"id": 35,
"slug": "maps",
"name": "Maps"
}
],
"rootPath": true
},
{
"slug": "adobe-target",
"name": "Adobe Target",
"description": "Adobe Target is an A/B testing, multi-variate testing, personalisation, and optimisation application",
"confidence": 100,
"version": "2.3.2",
"icon": "Adobe.svg",
"website": "https://www.adobe.com/marketing/target.html",
"cpe": null,
"categories": [
{
"id": 74,
"slug": "a-b-testing",
"name": "A/B Testing"
},
{
"id": 76,
"slug": "personalisation",
"name": "Personalisation"
}
],
"rootPath": true
},
{
"slug": "adobe-analytics",
"name": "Adobe Analytics",
"description": "Adobe Analytics is a web analytics, marketing and cross-channel analytics application.",
"confidence": 100,
"version": null,
"icon": "Adobe Analytics.svg",
"website": "https://www.adobe.com/analytics/adobe-analytics.html",
"cpe": null,
"categories": [
{
"id": 10,
"slug": "analytics",
"name": "Analytics"
}
],
"rootPath": true
},
{
"slug": "preact",
"name": "Preact",
"description": "Preact is a JavaScript library that describes itself as a fast 3kB alternative to React with the same ES6 API.",
"confidence": 100,
"version": null,
"icon": "Preact.svg",
"website": "https://preactjs.com",
"cpe": null,
"categories": [
{
"id": 59,
"slug": "javascript-libraries",
"name": "JavaScript libraries"
}
],
"rootPath": true
},
{
"slug": "hsts",
"name": "HSTS",
"description": "HTTP Strict Transport Security (HSTS) informs browsers that the site should only be accessed using HTTPS.",
"confidence": 100,
"version": null,
"icon": "default.svg",
"website": "https://www.rfc-editor.org/rfc/rfc6797#section-6.1",
"cpe": null,
"categories": [
{
"id": 16,
"slug": "security",
"name": "Security"
}
],
"rootPath": true
},
{
"slug": "open-graph",
"name": "Open Graph",
"description": "Open Graph is a protocol that is used to integrate any web page into the social graph.",
"confidence": 100,
"version": null,
"icon": "Open Graph.png",
"website": "https://ogp.me",
"cpe": null,
"categories": [
{
"id": 19,
"slug": "miscellaneous",
"name": "Miscellaneous"
}
],
"rootPath": true
}
]
}
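In the response above, each classification entry is a pair of strings of the form "Category name: ..." and "Confidence: ...". A minimal Python sketch of turning those pairs into (name, confidence) tuples follows; the helper name `parse_categories` is hypothetical and the sample is a shortened copy of the response shown.

```python
def parse_categories(pairs):
    """Convert ["Category name: X", "Confidence: 0.5"] pairs into (X, 0.5) tuples."""
    parsed = []
    for name_field, conf_field in pairs:
        name = name_field.split(":", 1)[1].strip()
        confidence = float(conf_field.split(":", 1)[1])
        parsed.append((name, confidence))
    return parsed


# Shortened copy of the example response.
sample = {
    "iab_classification": [
        ["Category name: Technology & Computing > Consumer Electronics",
         "Confidence: 0.5414350628852844"],
        ["Category name: Business and Finance > Business",
         "Confidence: 0.19046887755393982"],
    ],
    "filtering_taxonomy": [
        ["Category name: Computers & Technology", "Confidence: 1.0"],
    ],
}

iab = parse_categories(sample["iab_classification"])
top_name, top_conf = max(iab, key=lambda item: item[1])
tiers = top_name.split(" > ")  # taxonomy tiers, from broadest to most specific
print(tiers, top_conf)
```

Splitting the category name on " > " yields the individual tiers, which is convenient when you only need the top-level category.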
Example Code
Below are minimal code examples for calling our API in various languages. Add error handling (for example, checking the returned status value) before using them in production.
Example Code in Python
import http.client
conn = http.client.HTTPSConnection("www.websitecategorizationapi.com")
payload = 'query=www.alpha-quantum.com&api_key=your_api_key&data_type=url'
headers = {
'Content-Type': 'application/x-www-form-urlencoded'
}
conn.request("POST", "/api/iab/iab_web_content_filtering.php", payload, headers)
res = conn.getresponse()
data = res.read()
print(data.decode("utf-8"))
Example Code in JavaScript
var myHeaders = new Headers();
myHeaders.append("Content-Type", "application/x-www-form-urlencoded");
var urlencoded = new URLSearchParams();
urlencoded.append("query", "www.alpha-quantum.com");
urlencoded.append("api_key", "your_api_key");
urlencoded.append("data_type", "url");
var requestOptions = {
method: 'POST',
headers: myHeaders,
body: urlencoded,
redirect: 'follow'
};
fetch("https://www.websitecategorizationapi.com/api/iab/iab_web_content_filtering.php", requestOptions)
.then(response => response.text())
.then(result => console.log(result))
.catch(error => console.log('error', error));
Example Code in Ruby
require 'net/http'
require 'uri'
uri = URI("https://www.websitecategorizationapi.com/api/iab/iab_web_content_filtering.php")
request = Net::HTTP::Post.new(uri)
request["Content-Type"] = "application/x-www-form-urlencoded"
payload = "query=www.alpha-quantum.com&api_key=your_api_key&data_type=url"
request.body = payload
response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
http.request(request)
end
puts response.body
Example Code in PHP
<?php
$apiKey = "your_api_key";
$query = "www.alpha-quantum.com";
$dataType = "url";
$url = "https://www.websitecategorizationapi.com/api/iab/iab_web_content_filtering.php";
$postData = http_build_query([
'query' => $query,
'api_key' => $apiKey,
'data_type' => $dataType
]);
$options = [
"http" => [
"header" => "Content-Type: application/x-www-form-urlencoded\r\n",
"method" => "POST",
"content" => $postData,
],
];
$context = stream_context_create($options);
$response = file_get_contents($url, false, $context);
echo $response;
?>
Example Code in C#
using System;
using System.Net.Http;
using System.Threading.Tasks;
using System.Collections.Generic;
class Program
{
static async Task Main(string[] args)
{
var apiKey = "your_api_key";
var query = "www.alpha-quantum.com";
var dataType = "url";
var url = "https://www.websitecategorizationapi.com/api/iab/iab_web_content_filtering.php";
using var client = new HttpClient();
var data = new FormUrlEncodedContent(new Dictionary<string, string>
{
{ "query", query },
{ "api_key", apiKey },
{ "data_type", dataType }
});
var response = await client.PostAsync(url, data);
var result = await response.Content.ReadAsStringAsync();
Console.WriteLine(result);
}
}
Website Categorization of Texts
To categorize plain text instead of a URL, set data_type=text and call: /api/iab/iab_content_filtering.php.
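A minimal Python sketch of a text request, assuming the text endpoint lives on the same host as the URL endpoint and accepts the same form-encoded POST parameters (the source gives only the path). The function names are hypothetical, and `categorize_text` is defined but not called here because it needs a valid API key and network access.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumption: same host as the URL endpoint, with the path given above.
TEXT_ENDPOINT = "https://www.websitecategorizationapi.com/api/iab/iab_content_filtering.php"


def build_text_request(text, api_key):
    """Prepare a POST request that asks for categorization of plain text."""
    body = urlencode({"query": text, "api_key": api_key, "data_type": "text"})
    return urllib.request.Request(
        TEXT_ENDPOINT,
        data=body.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def categorize_text(text, api_key, timeout=30):
    """Send the request and decode the JSON response."""
    request = build_text_request(text, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_text_request("New laptop review & benchmarks", "your_api_key")
print(req.get_method(), req.full_url)
print(req.data)
```

Using `urlencode` matters for text input in particular, since free text often contains spaces, ampersands, and other characters that must be escaped.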
Batch Processing API
Our Batch Processing API allows you to submit large collections of URLs for categorization in a single request. This is perfect for:
- Large-scale analysis - Process thousands of URLs at once
- Periodic batch jobs - Regular categorization of domain lists
- Data migration - Categorizing existing URL databases
- Research projects - Academic or commercial research requiring bulk categorization
Key Features
Flexible Input
Support for CSV, JSON, and TXT file formats. Upload files up to 50MB with up to 50,000 URLs per batch.
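A URL list larger than the per-batch limit has to be split before upload. A minimal Python sketch, assuming only the 50,000-URL limit stated above; the helper name and the decision to drop blank lines and duplicates first are illustrative choices, not documented API behavior, and the 50MB file-size limit is not checked here.

```python
MAX_URLS_PER_BATCH = 50_000  # per-batch limit stated above


def split_into_batches(urls, batch_size=MAX_URLS_PER_BATCH):
    """Strip blanks, drop duplicates (keeping order), and chunk into batches."""
    cleaned = (u.strip() for u in urls)
    unique = list(dict.fromkeys(u for u in cleaned if u))
    return [unique[i:i + batch_size] for i in range(0, len(unique), batch_size)]


# Placeholder domains, for illustration only.
urls = [f"site{i}.example" for i in range(120_000)]
batches = split_into_batches(urls)
print([len(b) for b in batches])  # [50000, 50000, 20000]
```

Each resulting batch can then be written to its own file and submitted as a separate job.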
Real-time Tracking
Monitor job progress with detailed status information, processing logs, and estimated completion times.
Quick Start
1. Submit a batch job:
curl -X POST \
-H 'Content-Type: multipart/form-data' \
-F 'api_key=your_api_key' \
-F '[email protected]' \
'https://www.websitecategorizationapi.com/api/batch/batch_upload.php'
2. Check job status:
curl 'https://www.websitecategorizationapi.com/api/batch/status.php?job_id=YOUR_JOB_ID&api_key=your_api_key'
3. Download results:
curl -O 'https://www.websitecategorizationapi.com/api/batch/download.php?job_id=YOUR_JOB_ID&format=json&api_key=your_api_key'
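Steps 2 and 3 can be sketched in Python as follows. The helper names are hypothetical; the endpoints and query parameters mirror the cURL commands above. Because the structure of the status response is not described here, `fetch` returns the raw bytes and leaves interpretation to the caller. It is defined but not called, since it needs a real job ID and API key.

```python
import urllib.request
from urllib.parse import urlencode

BATCH_BASE = "https://www.websitecategorizationapi.com/api/batch/"


def status_url(job_id, api_key):
    """URL for checking the status of a batch job (step 2)."""
    return BATCH_BASE + "status.php?" + urlencode({"job_id": job_id, "api_key": api_key})


def download_url(job_id, api_key, fmt="json"):
    """URL for downloading the results of a batch job (step 3)."""
    query = urlencode({"job_id": job_id, "format": fmt, "api_key": api_key})
    return BATCH_BASE + "download.php?" + query


def fetch(url, timeout=60):
    """GET the given URL and return the raw response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


print(status_url("YOUR_JOB_ID", "your_api_key"))
print(download_url("YOUR_JOB_ID", "your_api_key"))
```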
Supported File Formats
| Format | Description | Example |
|---|---|---|
| CSV | Comma-separated values, one URL per row | example.com, then google.com on the next row |
| JSON | JSON array or object with domains array | ["example.com", "google.com"] |
| TXT | Plain text, one URL per line | example.com, then google.com on the next line |
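A minimal Python sketch of producing each of the three formats from a list of URLs. The function names are hypothetical, and the JSON helper emits the plain-array form from the table rather than the object form.

```python
import csv
import io
import json


def to_csv(urls):
    """One URL per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for url in urls:
        writer.writerow([url])
    return buffer.getvalue()


def to_json(urls):
    """JSON array of URLs."""
    return json.dumps(list(urls))


def to_txt(urls):
    """One URL per line."""
    return "\n".join(urls) + "\n"


urls = ["example.com", "google.com"]
print(to_csv(urls), to_json(urls), to_txt(urls), sep="\n")
```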
Errors
Each response includes a status value. For example: {"classification": "...", "status": 200, ...}
Possible error codes:
| Error Code | Meaning |
|---|---|
| 200 | Request was successful. |
| 400 | Error forming request. Check parameters. |
| 401 | Invalid API key. Purchase a subscription or check the key for typos. |
| 403 | Monthly quota used up. Upgrade or buy more credits. |
| 407 | Missing data_type. Must be "url" or "text". |
| 410 | Insufficient tokens in content / URL could not be loaded. |
| 411 | URL content could not be fetched. |
| 500 | General error. Check request or contact support. |
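A minimal Python sketch of acting on the status value in a decoded response. The exception class and helper name are hypothetical, the messages are copied from the table above, and treating a missing status field as 500 is an assumption made for this sketch.

```python
ERROR_MEANINGS = {
    400: "Error forming request. Check parameters.",
    401: "Invalid API key.",
    403: "Monthly quota used up. Upgrade or buy more credits.",
    407: 'Missing data_type. Must be "url" or "text".',
    410: "Insufficient tokens in content / URL could not be loaded.",
    411: "URL content could not be fetched.",
    500: "General error. Check request or contact support.",
}


class CategorizationError(Exception):
    """Raised when the API reports a non-200 status."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status


def check_status(response):
    """Return the response unchanged on 200, otherwise raise CategorizationError."""
    # Assumption: a response without a status field is treated as a general error.
    status = int(response.get("status", 500))
    if status == 200:
        return response
    raise CategorizationError(status, ERROR_MEANINGS.get(status, "Unknown status code."))


try:
    check_status({"status": 401})
except CategorizationError as err:
    print(err)  # 401: Invalid API key.
```

Codes 410 and 411 depend on the target page rather than on the request, so a caller may prefer to log and skip those URLs instead of aborting a whole run.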
Specialized APIs
We provide 50+ advanced APIs for specialized use cases. Contact us at [email protected] for setup and integration support.
Content Analysis APIs
Analyze, classify, and extract insights from web content
Sentiment Analysis
Analyzes text or webpages to determine positive, negative, or neutral sentiment for brand alignment.
Keyword & Entity Extraction
Identifies key terms and named entities (people, places, organizations) for SEO and ad targeting.
Content Summarization
Generates concise summaries of articles or webpages for content vetting and snippet previews.
Topic Modeling
Clusters documents based on underlying themes for refined content categorization.
Contextual Keyword Suggestion
Suggests relevant keywords based on webpage context for ad matching.
Language Detection & Translation
Identifies language and provides instant translations for global audience reach.
Emotional Tone Analysis
Detects emotional cues (joy, anger, sadness, fear) for emotional context alignment.
Narrative Tone Shift Detection
Detects tone changes throughout articles to identify controversial sections.
Topic Sentiment Over Time
Tracks sentiment evolution over days, weeks, or months for reputation management.
Brand Safety & Compliance APIs
Protect your brand with content verification and compliance tools
Clickbait Detection
Scores headlines for "clickbait-ness" to filter sensational or misleading content.
Fake News Detection
Evaluates content credibility using linguistic patterns and fact-checker cross-referencing.
Children's Content Compliance
Evaluates COPPA compliance for child-safe advertising spaces.
Live Content Moderation
Real-time filtering of user-generated content based on brand safety guidelines.
Link Reputation
Analyzes trustworthiness of external links to limit association with dubious sites.
Regulatory Compliance
Identifies GDPR, CCPA, and region-specific compliance issues.
Authorship Credibility Scoring
Assesses author reliability through historical content and social footprint analysis.
Outbound Link Categorization
Classifies outbound links to identify brand-safe, malicious, or competitor sites.
Plagiarism & Duplicate Content
Flags duplicate or plagiarized content to maintain editorial integrity.
Audience & User Analysis APIs
Understand and segment your audience for better targeting
User Persona Prediction
Predicts psychographic traits like lifestyle interests and buying behaviors.
Audience Segmentation
Groups visitors based on behaviors or interests for tailored ad placements.
User Feedback Analysis
Analyzes comments and reviews to uncover sentiment trends and satisfaction levels.
Lookalike Audience
Finds users with similar behavior patterns to high-value audience segments.
Psycholinguistic Profiling
Analyzes language use to infer personality or psychological traits.
User Intent & Micro-Moment
Identifies high purchase intent moments based on real-time context signals.
Customer Journey Mapping
Tracks user navigation patterns to identify drop-off points and optimize funnels.
Advertising & Marketing APIs
Optimize campaigns, detect fraud, and improve conversions
Ad Fraud Detection
Identifies suspicious ad traffic and bot-generated clicks in real time.
Conversion Probability
ML-powered scoring of user conversion likelihood based on context and behavior.
A/B Testing & Optimization
Automates ad variation serving and determines most effective variants.
Dynamic Retargeting
Reassesses user interest based on new context to adjust retargeting campaigns.
Ad Creative Generation
Suggests ad creative variations based on product details and audience insights.
Viewability Detection
Detects if ads are actually seen by users for dynamic bidding support.
Engagement Prediction
Scores content likelihood to generate comments, likes, or shares.
Trend Prediction
Forecasts trending topics or products using historical and real-time data.
Social Media Listening
Aggregates and analyzes social media posts for brand mention insights.
Influencer Quality Analysis
Evaluates influencer authenticity, engagement quality, and demographics.
Omnichannel Consistency
Ensures consistent messaging across web, mobile, app, and email channels.
Adaptive Content Personalization
Real-time AI-driven content adjustments based on user behavior signals.
Content Recommendation
Recommends articles or products based on browsing history for engagement.
E-commerce APIs
Power your online store with intelligent categorization and competitive insights
Product Categorization
Classifies products into hierarchical taxonomies for niche e-commerce verticals.
Competitor Pricing Monitoring
Tracks competitor price changes for dynamic pricing strategies.
Competitor Content Analysis
Monitors competitor websites for new content and promotions.
Geolocation & Cultural Relevance
Ensures cultural/geographic relevance for global campaigns.
Technical & SEO APIs
Optimize technical performance, SEO, and user experience
SEO Health & Score
Evaluates metadata, keyword density, page speed for organic ranking optimization.
Accessibility Compliance
Scans pages for WCAG compliance to meet accessibility standards.
Web Page Layout & UX Scoring
Scores usability of design and layout including navigation and ad placement.
Microcopy & UX Writing
Generates or refines short text prompts for improved clarity and user flow.
Multimedia APIs
Analyze images, videos, and audio content
Multimedia Content Classification
Classifies images, videos, and audio for holistic brand safety checks.
Speech-to-Text Analysis
Converts audio to text and applies sentiment/classification models.
In-image Ad Contextualization
Analyzes image content to determine best ad overlay alignment.
Brand Logo Detection
Detects brand logos in images/videos to confirm visibility or conflicts.
Ready to Integrate?
Each API can stand alone or integrate with your existing stack. Combine multiple APIs to build a robust ecosystem covering brand safety, content relevance, and user engagement.