Traffic Control
Spyglasses gives you powerful control over which AI agents and bots can access your website. You can block malicious scrapers, AI model trainers, and other unwanted traffic while ensuring legitimate bots like search engines can still crawl your site.
What You'll Learn
In this guide, you'll learn how to:
- Configure basic bot blocking settings
- Create custom block and allow rules
- Exclude specific paths from monitoring
- Implement traffic control on different platforms
- Use advanced pattern matching for fine-grained control
Configuration Options
Spyglasses offers several configuration options to control traffic to your website:
Block AI Model Trainers
The simplest way to protect your content from being used to train AI models is to enable the `blockAiModelTrainers` option. This automatically blocks known AI model training bots like GPTBot, Claude-Bot, and others.
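In a code-based setup this is a single boolean in your Spyglasses configuration. A minimal sketch of the option (the surrounding middleware setup is shown in the Platform Implementation section below; the exact config shape may vary by SDK version):

```typescript
// Sketch: the blockAiModelTrainers option inside a Spyglasses config object.
const spyglassesConfig = {
  // Block known AI model training crawlers (GPTBot, Claude-Bot, etc.)
  blockAiModelTrainers: true,
};
```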
Custom Block Rules
Use `customBlocks` to block specific bots or categories of bots. You can specify the following (see the sketch after this list):
- Categories: Block entire categories like `category:Scraper`
- Patterns: Block specific bot names like `pattern:SomeBot`
- User agents: Block specific user agent strings
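A hedged sketch showing all three rule types together (the bot names and the raw user-agent entry are illustrative; check your dashboard for the exact patterns available to you):

```typescript
// Sketch: customBlocks rules using the syntax described above.
const spyglassesConfig = {
  customBlocks: [
    'category:Scraper',                           // block an entire category
    'pattern:SomeBot',                            // block a specific bot by name
    'BadScraper/1.0 (+https://example.com/bot)',  // block an exact user agent string (illustrative)
  ],
};
```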
Custom Allow Rules
Use `customAllows` to override blocks and ensure important bots can always access your site. Allow rules take precedence over block rules.
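For example, a sketch that blocks the entire Scraper category but keeps Googlebot allowed (pattern names are illustrative):

```typescript
// Allow rules win: Googlebot stays allowed even though the Scraper category is blocked.
const spyglassesConfig = {
  customBlocks: ['category:Scraper'],
  customAllows: ['pattern:Googlebot'],
};
```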
Path Exclusions
Use `excludePaths` to exclude certain paths from monitoring entirely. This is useful for health checks, admin pages, or API endpoints.
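A short sketch with typical exclusions (paths are illustrative):

```typescript
// Requests to these paths are neither monitored nor blocked.
const spyglassesConfig = {
  excludePaths: [
    '/health',       // load balancer health checks
    '/wp-admin',     // admin pages
    '/api/internal', // internal API endpoints
  ],
};
```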
Platform Implementation
Next.js (Code Configuration)
For Next.js applications and other sites built with code, you configure traffic control directly in your middleware. Here's a comprehensive example:
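The sketch below assumes the `@spyglasses/next` package and its `createSpyglassesMiddleware` helper; check your SDK's documentation for the exact import and option names.

```typescript
// middleware.ts — a sketch of a full traffic-control setup; the package name,
// helper name, and debug option name are assumptions based on this guide.
import { createSpyglassesMiddleware } from '@spyglasses/next';

export const middleware = createSpyglassesMiddleware({
  apiKey: process.env.SPYGLASSES_API_KEY!,

  // Block known AI model training bots (GPTBot, Claude-Bot, etc.)
  blockAiModelTrainers: true,

  // Block additional categories and specific bots
  customBlocks: [
    'category:Scraper',
    'pattern:SomeBot',
  ],

  // Always allow important bots, even when a block rule matches
  customAllows: [
    'pattern:Googlebot',
    'pattern:Bingbot',
  ],

  // Skip monitoring entirely for these paths
  excludePaths: [
    '/health',
    '/api/internal',
  ],

  // Verbose logging while testing (option name assumed; see Troubleshooting)
  debug: process.env.NODE_ENV !== 'production',
});

// Standard Next.js matcher: run on all routes except static assets
export const config = {
  matcher: ['/((?!_next/static|_next/image|favicon.ico).*)'],
};
```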
WordPress (Plugin Interface)
For WordPress sites and other platforms that use plugins, Spyglasses provides a user-friendly admin interface to configure traffic control without touching code.
The Bot Blocking Settings interface showing the main toggle for blocking AI model trainers and category-based blocking rules. Each category (AI Visitors, AI Model Trainers, Crawler, Scraper, etc.) can be individually configured with Block or Allow settings.
Category-Based Rules
The WordPress plugin organizes bots into logical categories, making it easy to apply rules to entire groups:
- AI Visitors: Includes AI assistants like ChatGPT, Claude, and Perplexity users
- AI Model Trainers: Bots specifically designed to collect training data (GPTBot, Claude-Bot, etc.)
- Crawler: General web crawlers and search engine bots
- Scraper: Content scrapers and data collection bots
- Special Purpose: Specialized bots for specific functions
- Unknown: Unclassified bot traffic
You can quickly block or allow entire categories with a single click, and the interface provides immediate visual feedback on your current settings.
Pattern-Based Rules
For more granular control, switch to the "By Pattern" tab to manage individual bot patterns:
The pattern-based interface showing specific bot user agents with their categories and individual Block/Allow settings. Notice how GPTBot is set to "Block" while Googlebot is set to "Allow", demonstrating fine-grained control.
This view shows:
- Individual bot patterns with their exact user agent strings
- Hierarchical categorization (e.g., "AI Visitors > AI Assistants")
- Current status with clear Block/Allow indicators
- Search functionality to quickly find specific patterns
- Visual color coding: blocked items show in red, allowed items in green
The interface makes it easy to override category settings for specific bots. For example, you might block the entire "AI Model Trainers" category but allow a specific research bot that you trust.
Advanced Configuration Examples
Protecting Specific Content
Block bots from accessing your most valuable content while allowing them to crawl general pages:
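One way to do this in a code-based setup is to scope the middleware to your high-value routes using Next.js's standard `matcher`, so blocking applies there while general pages are left alone. A sketch (route paths, package, and helper names are assumptions):

```typescript
// middleware.ts — sketch: apply blocking only to high-value content routes.
import { createSpyglassesMiddleware } from '@spyglasses/next';

export const middleware = createSpyglassesMiddleware({
  apiKey: process.env.SPYGLASSES_API_KEY!,
  blockAiModelTrainers: true,
  customBlocks: ['category:Scraper'],
});

// Only run on the content you most want to protect; everything else is untouched.
export const config = {
  matcher: ['/research/:path*', '/premium/:path*', '/reports/:path*'],
};
```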
E-commerce Protection
Protect product data while allowing legitimate shopping bots:
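A hedged sketch: block scrapers and AI trainers while explicitly allowing search and shopping crawlers (bot patterns and paths are illustrative):

```typescript
// middleware.ts — sketch: e-commerce protection (package/helper names assumed).
import { createSpyglassesMiddleware } from '@spyglasses/next';

export const middleware = createSpyglassesMiddleware({
  apiKey: process.env.SPYGLASSES_API_KEY!,
  blockAiModelTrainers: true,
  customBlocks: ['category:Scraper'],
  customAllows: [
    'pattern:Googlebot',
    'pattern:Storebot-Google', // Google Shopping crawler
    'pattern:Bingbot',
  ],
  // Keep checkout and cart flows out of monitoring
  excludePaths: ['/checkout', '/cart', '/api/internal'],
});
```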
Content Publisher Setup
Ideal for blogs and news sites that want to protect their content:
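A sketch for a publisher: block trainers and scrapers site-wide, but keep search engines and social preview bots allowed (patterns and paths are illustrative):

```typescript
// middleware.ts — sketch: content publisher setup (package/helper names assumed).
import { createSpyglassesMiddleware } from '@spyglasses/next';

export const middleware = createSpyglassesMiddleware({
  apiKey: process.env.SPYGLASSES_API_KEY!,
  blockAiModelTrainers: true,
  customBlocks: ['category:Scraper', 'category:Unknown'],
  customAllows: [
    'pattern:Googlebot',
    'pattern:Bingbot',
    'pattern:facebookexternalhit', // social link previews
    'pattern:Twitterbot',
  ],
  excludePaths: ['/wp-admin', '/feed'],
});
```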
Testing Your Configuration
After implementing traffic control, you can test your configuration:
- Check the Spyglasses dashboard to see which bots are being blocked
- Monitor your server logs for blocked requests
- Use browser developer tools to test excluded paths
- Verify search engine access using Google Search Console
Best Practices
Start Conservative
Begin with basic settings and gradually add more restrictive rules:
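For example, a conservative first configuration might enable only the built-in AI trainer blocking (a sketch; the config shape is assumed from the options above):

```typescript
// Sketch: a conservative starting point — only block AI model trainers,
// then add customBlocks rules once you've reviewed your traffic.
const spyglassesConfig = {
  blockAiModelTrainers: true,
  customBlocks: [], // add 'category:...' or 'pattern:...' rules later
};
```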
Always Allow Search Engines
Make sure legitimate search engines can access your content:
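For instance, explicit allow rules for the major search engines (a sketch; pattern names are illustrative and should match the patterns shown in your dashboard):

```typescript
// Sketch: always allow the major search engine crawlers.
const spyglassesConfig = {
  customAllows: [
    'pattern:Googlebot',
    'pattern:Bingbot',
    'pattern:DuckDuckBot',
    'pattern:Applebot',
  ],
};
```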
Monitor Impact
Regularly check your analytics to ensure you're not blocking legitimate traffic. The Spyglasses dashboard provides detailed reports on blocked requests.
Use Exclusions Wisely
Exclude paths that don't need protection or monitoring:
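A quick sketch of typical candidates (paths are illustrative):

```typescript
// Sketch: paths that rarely need monitoring or protection.
const spyglassesConfig = {
  excludePaths: [
    '/health',
    '/robots.txt',
    '/sitemap.xml',
    '/api/webhooks',
  ],
};
```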
Troubleshooting
Bot Still Getting Through
If unwanted bots are still accessing your site:
- Check if they match an allow rule
- Verify your patterns are correct
- Look for new bot user agents in your logs
- Contact support for help with custom patterns
Legitimate Traffic Blocked
If you're accidentally blocking legitimate traffic:
- Add specific allow rules for important bots
- Check your custom block patterns aren't too broad
- Review your exclusion paths
- Test with debug mode enabled