Secure Scraping

Give your AI agents safe internet access

The only secure gateway designed for autonomous web scraping. Prevent indirect prompt injection, block malicious sites, and control costs.

6

Threat types detected

< 10ms

Cached response time

50%+

Cost savings via caching

How It Works

Three steps to safe web access.

1

Agent requests URL

Your agent sends a URL to the Bastio API instead of fetching it directly.

2

Sandboxed security scan

We render the page in an isolated browser, scanning for injections, malware, and PII.

3

Clean content returned

You receive clean Markdown or JSON with detected threats removed.

Threat Categories

Six types of web content threats.

| Threat | Example | Action |
| --- | --- | --- |
| Prompt Injection | Hidden instructions in page content | Block |
| Credential Theft | API keys, tokens in scraped data | Redact |
| Malicious URLs | C2 servers, phishing domains | Block |
| Code Injection | Malicious script blocks | Sanitize |
| Data Exfiltration | process.env leaks | Block |
| Fake Documentation | Poisoned API docs | Warn |

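
The threat-to-action mapping in the table above could be modeled on the agent side roughly as follows. This is a minimal sketch: the names `ThreatType`, `ThreatAction`, `DEFAULT_ACTIONS`, and `shouldBlock` are hypothetical and are not taken from any published Bastio SDK.

```typescript
// Hypothetical identifiers for the six threat categories listed above.
type ThreatType =
  | "prompt_injection"
  | "credential_theft"
  | "malicious_url"
  | "code_injection"
  | "data_exfiltration"
  | "fake_documentation";

// The four actions that appear in the Action column.
type ThreatAction = "block" | "redact" | "sanitize" | "warn";

// Mirrors the Action column of the table; an illustrative default, not a real config.
const DEFAULT_ACTIONS: Record<ThreatType, ThreatAction> = {
  prompt_injection: "block",
  credential_theft: "redact",
  malicious_url: "block",
  code_injection: "sanitize",
  data_exfiltration: "block",
  fake_documentation: "warn",
};

// Returns true when at least one detected threat maps to a hard block.
function shouldBlock(detected: ThreatType[]): boolean {
  return detected.some((threat) => DEFAULT_ACTIONS[threat] === "block");
}
```

With this mapping, a page that only triggers credential redaction and a documentation warning would still be returned, while any malicious URL hit would stop the response.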
Block Behaviors

Three configurable response modes.

| Behavior | Description | Use Case |
| --- | --- | --- |
| block | Return error, no content | Maximum security |
| sanitize | Redact threats, return safe content | Default |
| warn | Return full content with threat warnings | Monitoring |
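
To make the difference between the three modes concrete, here is a small sketch of how a scan result might be turned into a response under each behavior. The `ScanResult` and `GuardResponse` shapes and the `applyBehavior` function are assumptions for illustration only; the actual response format is not described on this page.

```typescript
type BlockBehavior = "block" | "sanitize" | "warn";

// Hypothetical result of a security scan.
interface ScanResult {
  content: string;          // page content as scraped
  sanitizedContent: string; // same content with threats redacted
  threats: string[];        // labels of detected threats
}

// Hypothetical response handed back to the agent.
interface GuardResponse {
  content: string | null;
  warnings: string[];
  error?: string;
}

// "sanitize" is the default, matching the Use Case column above.
function applyBehavior(
  scan: ScanResult,
  behavior: BlockBehavior = "sanitize",
): GuardResponse {
  // Clean pages pass through unchanged in every mode.
  if (scan.threats.length === 0) {
    return { content: scan.content, warnings: [] };
  }
  if (behavior === "block") {
    // Return an error and no content.
    return { content: null, warnings: [], error: "threats detected" };
  }
  if (behavior === "warn") {
    // Return the full content alongside threat warnings.
    return { content: scan.content, warnings: [...scan.threats] };
  }
  // sanitize: return only the redacted content.
  return { content: scan.sanitizedContent, warnings: [] };
}
```

In this sketch `warn` is the only mode in which unsanitized content reaches the agent, which is why the table positions it for monitoring rather than production use.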

What's included

Complete protection for web scraping agents

From threat detection to cost control, everything you need to give your agents safe internet access.

Indirect prompt injection detection
Credential & API key redaction
Malicious URL blocking
Code injection sanitization
Data exfiltration prevention
Fake documentation detection
Per-proxy URL caching (24h TTL)
Firecrawl SDK compatibility
BYOK mode (bring your own key)
Cost tracking & analytics
Configurable block behaviors
Rate limiting per agent

Firecrawl SDK (TypeScript)

Drop-in replacement with security scanning

import Firecrawl from '@mendable/firecrawl-js';

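const app = new Firecrawl({
  // Bastio API key, read from the environment
  apiKey: process.env.BASTIO_API_KEY,
  // Point the SDK at Bastio instead of the default Firecrawl endpoint
  apiUrl: 'https://api.bastio.com/v1/firecrawl'
});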

// Use exactly as you normally would
const result = await app.scrape(
  'https://example.com',
  {
    formats: ['markdown', 'html'],
    onlyMainContent: true
  }
);

cURL Example

POST /v1/guard/{proxyID}/scrape

curl -X POST \
  "https://api.bastio.com/v1/guard/{proxyID}/scrape" \
  -H "Authorization: Bearer bastio_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "formats": ["markdown"]
  }'
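
For agents written in TypeScript, the same request can be issued with the built-in `fetch` (Node 18+). The sketch below reuses only the endpoint, headers, and body fields shown in the cURL example; the helper names `buildScrapeRequest` and `scrape` are hypothetical, and the response is left as `unknown` because its shape is not documented here.

```typescript
// Body fields taken from the cURL example above.
interface ScrapeRequest {
  url: string;
  formats: string[];
}

interface BuiltRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the request without sending it, which keeps it easy to inspect and test.
function buildScrapeRequest(
  proxyId: string,
  apiKey: string,
  body: ScrapeRequest,
): BuiltRequest {
  return {
    url: `https://api.bastio.com/v1/guard/${encodeURIComponent(proxyId)}/scrape`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Sends the request; the response shape is an assumption, so it is returned as unknown.
async function scrape(
  proxyId: string,
  apiKey: string,
  targetUrl: string,
): Promise<unknown> {
  const { url, init } = buildScrapeRequest(proxyId, apiKey, {
    url: targetUrl,
    formats: ["markdown"],
  });
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`Scrape failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

Callers would narrow the `unknown` result against whatever schema the API actually returns before passing content to the agent.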

Safe Browsing

Block access to known malicious domains, C2 servers, and phishing sites. Maintain allow-lists for strict control.
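
As an illustration of what a strict allow-list check involves, the sketch below accepts a URL only if its hostname equals an allowed domain or is a subdomain of one. The function name `isAllowed` and the matching rule are assumptions, not a description of how Bastio evaluates allow-lists.

```typescript
// Returns true only when the URL's host is an allowed domain or one of its subdomains.
function isAllowed(rawUrl: string, allowList: string[]): boolean {
  let host: string;
  try {
    host = new URL(rawUrl).hostname.toLowerCase();
  } catch {
    // Unparseable input is rejected rather than guessed at.
    return false;
  }
  return allowList.some((domain) => {
    const d = domain.toLowerCase();
    // The leading dot prevents "example.com.evil.io" from matching "example.com".
    return host === d || host.endsWith(`.${d}`);
  });
}
```

Matching on the parsed hostname rather than on the raw string avoids lookalike tricks such as placing an allowed domain in the path or as a prefix of a hostile one.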

Credential Protection

Detect and redact API keys, PII, and sensitive data from scraped content before it reaches your agent.
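
Redaction of this kind is often implemented as pattern matching over the scraped text. The following sketch uses two illustrative patterns (AWS access key IDs and email addresses); the pattern list, the `redact` function, and the `[REDACTED:…]` placeholder format are assumptions for demonstration and do not reflect Bastio's actual detectors.

```typescript
// Illustrative patterns only; a production detector would cover many more formats.
const PATTERNS: { label: string; regex: RegExp }[] = [
  // AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
  { label: "AWS_ACCESS_KEY_ID", regex: /\bAKIA[0-9A-Z]{16}\b/g },
  // A simple email matcher, deliberately loose.
  { label: "EMAIL", regex: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
];

// Replaces every match with a labeled placeholder and reports what was found.
function redact(text: string): { text: string; found: string[] } {
  const found: string[] = [];
  let out = text;
  for (const { label, regex } of PATTERNS) {
    out = out.replace(regex, () => {
      found.push(label);
      return `[REDACTED:${label}]`;
    });
  }
  return { text: out, found };
}
```

Keeping a label in the placeholder lets the agent know that something was removed without ever seeing the secret itself.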

Cost Control

Intelligent caching reduces scraping costs by up to 50%. Set strict budget limits and rate limits per agent.
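
The savings come from not re-scraping a URL that was fetched recently. A minimal sketch of a per-proxy cache with the 24-hour TTL mentioned in the feature list is shown below; the `ScrapeCache` class, its in-memory `Map` storage, and the injectable clock are assumptions made for illustration.

```typescript
// 24 hours in milliseconds, matching the TTL stated in the feature list.
const TTL_MS = 24 * 60 * 60 * 1000;

interface CacheEntry {
  content: string;
  storedAt: number;
}

class ScrapeCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Entries are keyed per proxy, so one proxy never reads another's cache.
  private key(proxyId: string, url: string): string {
    return `${proxyId}\n${url}`;
  }

  get(proxyId: string, url: string): string | undefined {
    const k = this.key(proxyId, url);
    const entry = this.entries.get(k);
    if (entry === undefined) return undefined;
    if (this.now() - entry.storedAt >= TTL_MS) {
      this.entries.delete(k); // expired: drop it and report a miss
      return undefined;
    }
    return entry.content;
  }

  set(proxyId: string, url: string, content: string): void {
    this.entries.set(this.key(proxyId, url), { content, storedAt: this.now() });
  }
}
```

Every hit within the TTL window replaces a paid scrape with a local lookup, which is where both the cost reduction and the fast cached response time would come from.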

Secure your agents today

Don't let your AI agents become a security liability. Start protecting your infrastructure now.