Secure Scraper
Protect AI agents from web-based attacks
Firecrawl-compatible web scraping with built-in security scanning. Block indirect prompt injections, cache responses, and control which URLs your agents can access.
6
Threat categories
< 10ms
Cached response
50%+
Cost savings
Scan every scraped page before it reaches your agent.
Agent scrapes URL
Your agent requests web content via the Firecrawl-compatible endpoint.
Bastio scans content
Six threat categories are analyzed in real time before content is returned.
Block, sanitize, or warn
Safe content delivered instantly. Threats blocked, redacted, or flagged.
Six categories of web content attacks.
| Threat | Example | Action |
|---|---|---|
| Env Exfiltration | process.env.*, os.environ[] | Block |
| Malicious Code | exec(), spawn(), system() | Block |
| Suspicious URLs | ngrok, webhook.site, IP-based | Block |
| Fake Documentation | URGENT: Security update... | Sanitize |
| Prompt Injections | Ignore previous instructions | Sanitize |
| Jailbreak Attempts | DAN prompts, roleplay bypass | Block |
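To make the categories above more concrete, here is a toy pattern matcher built from a few of the examples in the table. The pattern set, the category keys, and the `scan` function are hypothetical illustrations, not Bastio's actual detection logic, which is not described here.

```python
import re

# Hypothetical patterns loosely based on the examples in the threat table.
# Illustrative only: these are NOT Bastio's real detection rules.
PATTERNS = {
    "env_exfiltration": re.compile(r"process\.env\.\w+|os\.environ\["),
    "malicious_code": re.compile(r"\b(?:exec|spawn|system)\s*\("),
    "suspicious_url": re.compile(
        r"ngrok|webhook\.site|https?://\d{1,3}(?:\.\d{1,3}){3}"
    ),
    "prompt_injection": re.compile(
        r"ignore (?:all )?previous instructions", re.IGNORECASE
    ),
}


def scan(text: str) -> list:
    """Return the name of every category whose pattern appears in text."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]
```

A simple matcher like this will produce false positives on pages that legitimately discuss code, which is one reason a sanitize or warn action can be preferable to a hard block for some workloads.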
Configurable responses for detected threats.
| Action | Behavior | Use Case |
|---|---|---|
| block | Return error, no content delivered | Autonomous agents, compliance |
| sanitize | Redact threats, return safe content | Research assistants (default) |
| warn | Return full content with threat warnings | Testing, monitoring |
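On the client side, the three actions in the table above might be handled along these lines. Only the `"BLOCK"` value and the `security.action`, `security.threats_found`, and `data.markdown` fields appear in the Python example on this page; the `"WARN"` and `"SANITIZE"` values and the `content_for_agent` helper are assumptions made for this sketch.

```python
from typing import Optional


def content_for_agent(result: dict) -> Optional[str]:
    """Decide what text, if any, to hand to the agent for one scrape result."""
    security = result.get("security", {})
    action = security.get("action")
    threats = security.get("threats_found")

    if action == "BLOCK":
        # Blocked pages deliver no content at all.
        return None

    content = result["data"]["markdown"]

    if action == "WARN":  # assumed value, not confirmed by the source
        # Full content comes back, so flag it before the agent reads it.
        return f"[warning: threats detected: {threats}]\n{content}"

    # "SANITIZE" (assumed value) or a clean page: content is already redacted.
    return content
```

Returning `None` for blocked pages forces the caller to handle the missing content explicitly instead of passing an empty string to the agent.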
What's included
Security, caching, and compatibility — built in
Every scraped URL gets automatic threat scanning, intelligent caching, and domain control with no extra configuration.
Drop-in Firecrawl Replacement
Change one URL to add security scanning to your existing Firecrawl integration.
# Before (Firecrawl)
POST api.firecrawl.dev/v2/scrape
# After (Bastio)
POST api.bastio.com/v1/guard/{proxyID}/scrape
Python Example
Full Firecrawl v2 API compatibility, plus a security object in the response.
import requests

PROXY_ID = "your-proxy-id"  # placeholder
API_KEY = "your-api-key"  # placeholder

response = requests.post(
    f"https://api.bastio.com/v1/guard/{PROXY_ID}/scrape",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"url": "https://example.com", "formats": ["markdown"]},
)
result = response.json()

# Check the security verdict before using the scraped content.
if result["security"]["action"] == "BLOCK":
    print(f"Threat blocked: {result['security']['threats_found']}")
else:
    content = result["data"]["markdown"]
Intelligent Caching
Per-proxy 24-hour URL cache cuts costs by 50%+ while still scanning for threats on every request.
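The caching behavior described above can be sketched as a small in-memory cache keyed on proxy and URL. This is a minimal illustration, not Bastio's implementation: the `ScrapeCache` class and the injected `fetch`, `scan`, and `clock` callables are hypothetical names chosen for this example.

```python
import time
from typing import Callable, Dict, List, Tuple

TTL_SECONDS = 24 * 60 * 60  # the 24-hour window mentioned in the text


class ScrapeCache:
    """Per-proxy URL cache that still scans content on every request."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        scan: Callable[[str], List[str]],
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._scan = scan
        self._clock = clock
        # (proxy_id, url) -> (time stored, content)
        self._store: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def get(self, proxy_id: str, url: str) -> Tuple[str, List[str]]:
        key = (proxy_id, url)
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < TTL_SECONDS:
            content = entry[1]  # fresh cache hit: no upstream scrape
        else:
            content = self._fetch(url)  # miss or expired: scrape again
            self._store[key] = (now, content)
        # The scan runs whether or not the content came from the cache.
        return content, self._scan(content)
```

Keying on the proxy as well as the URL keeps one proxy's cached pages from being served to another, and scanning after the lookup means updated detection rules also apply to pages cached earlier.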
URL Control
Allow-lists and block-lists give defense-in-depth control over which domains your agents can access.
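As a rough illustration of how allow-lists and block-lists can combine, the sketch below checks a URL's hostname against both. The function names and the precedence rules (block-list wins, and an empty allow-list permits any domain that is not blocked) are assumptions for this example, not documented Bastio behavior.

```python
from typing import Iterable
from urllib.parse import urlsplit


def domain_matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def is_url_allowed(
    url: str, allow_list: Iterable[str], block_list: Iterable[str]
) -> bool:
    host = (urlsplit(url).hostname or "").lower()
    if not host:
        return False  # reject anything without a parseable hostname

    # Assumption: the block-list always takes precedence.
    if any(domain_matches(host, d) for d in block_list):
        return False

    # Assumption: an empty allow-list means "allow anything not blocked".
    allowed = list(allow_list)
    if allowed:
        return any(domain_matches(host, d) for d in allowed)
    return True
```

Matching on the leading dot prevents a look-alike host such as `notexample.com` from passing an `example.com` allow-list entry.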
BYOK Mode
Bring your own Firecrawl API key and pay only a $0.0005/URL security scanning fee.
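For budgeting, the per-URL fee quoted above is simple to project. The helper below is a hypothetical sketch that covers only the scanning fee; it ignores any free tier and Firecrawl's own charges, and whether the fee applies to cached responses is not stated here.

```python
from decimal import Decimal

SCAN_FEE_PER_URL = Decimal("0.0005")  # USD per URL, as quoted in the text


def byok_scan_fee(url_count: int) -> Decimal:
    """Estimated scanning fee in USD for a given number of scraped URLs."""
    if url_count < 0:
        raise ValueError("url_count must be non-negative")
    return SCAN_FEE_PER_URL * url_count
```

Using `Decimal` instead of `float` keeps small per-unit prices exact when they are multiplied across large volumes.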
Start securing your web scraping
100 free secure scrapes per month. Full Firecrawl compatibility.