
Firecrawl SDK Compatibility

Use the official Firecrawl SDK with Bastio for secure web scraping.

Using Firecrawl SDK with Bastio

Bastio exposes a Firecrawl-compatible endpoint, so you can keep using the official Firecrawl SDK and point it at Bastio instead of Firecrawl. This suits existing Firecrawl users who want Bastio's security scanning to protect their AI agents from indirect prompt injection in web content.

Note: This is a secondary integration option. The primary and recommended approach is using Bastio's direct HTTP API at /v1/guard/:proxyID/scrape, which provides full control over security options and extended response data.

Prerequisites

  • Firecrawl account with an API key (BYOK mode)
  • Bastio account with at least one proxy configured

Setup

Step 1: Create a Proxy with Firecrawl BYOK Key

  1. Go to the Bastio Dashboard
  2. Navigate to Proxies > Create New Proxy
  3. Configure your proxy settings
  4. In the Scraper Configuration section:
    • Enable Bring Your Own Key (BYOK)
    • Enter your Firecrawl API Key
  5. Save the proxy

Step 2: Create a Scoped API Key

The Firecrawl SDK does not include a proxy ID in requests, so you need an API key scoped to exactly one proxy:

  1. Go to API Keys > Create New Key
  2. Set Access Type to Specific Proxy
  3. Select the proxy you configured in Step 1
  4. Give the key a descriptive name (e.g., "Firecrawl SDK Key")
  5. Click Create and copy the key immediately (you will only see it once)

Step 3: Configure the Firecrawl SDK

JavaScript/TypeScript

import Firecrawl from '@mendable/firecrawl-js';

const app = new Firecrawl({
  apiKey: 'sk_bastio_xxx',  // Your Bastio API key (scoped to one proxy)
  apiUrl: 'https://api.bastio.com/v1/firecrawl'
});

// Calls work the same as they do against Firecrawl directly
const result = await app.scrape('https://example.com', {
  formats: ['markdown', 'html']
});

console.log(result.data?.markdown);

Python

from firecrawl import FirecrawlApp

app = FirecrawlApp(
    api_key='sk_bastio_xxx',  # Your Bastio API key (scoped to one proxy)
    api_url='https://api.bastio.com/v1/firecrawl'
)

# Calls work the same as they do against Firecrawl directly
result = app.scrape_url('https://example.com', params={
    'formats': ['markdown', 'html']
})

print(result.get('data', {}).get('markdown'))

Currently Supported Endpoints

Endpoint     SDK Method               Status
/v1/scrape   scrape() / scrape_url()  Supported
/v1/crawl    crawl() / crawl_url()    Coming Soon
/v1/map      map() / map_url()        Coming Soon

How Security Scanning Works

When you use the Firecrawl SDK with Bastio:

  1. Request Processing: Your scrape request is received by Bastio
  2. Firecrawl Call: Bastio forwards the request to Firecrawl using your BYOK key
  3. Security Scan: The returned content is scanned for threats (prompt injections, malicious code, etc.)
  4. Response: Bastio returns or blocks the content based on its threat score:
    • High threat (score >= 0.7): Request is blocked with an error
    • Medium threat (0.3 <= score < 0.7): The standard Firecrawl response is returned and the event is logged for monitoring
    • Low threat (score < 0.3): The standard Firecrawl response is returned

Security scanning is transparent: responses keep the standard Firecrawl shape, with no extra fields.
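
The threshold rules above can be sketched as a small function. This is an illustration of the documented cut-offs, not Bastio's implementation: the Action names are hypothetical, and the assumption that scores fall between 0 and 1 is inferred from the thresholds rather than stated in this guide.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical labels for the three documented outcomes."""
    BLOCK = "block"  # high threat: request fails with an error
    LOG = "log"      # medium threat: content returned, event logged
    PASS = "pass"    # low threat: content returned normally


# Cut-offs copied from the list above.
HIGH_THRESHOLD = 0.7
MEDIUM_THRESHOLD = 0.3


def classify(score: float) -> Action:
    """Map a threat score to the documented outcome.

    Assumes scores lie in [0, 1]; that range is not confirmed by the guide.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score >= HIGH_THRESHOLD:
        return Action.BLOCK
    if score >= MEDIUM_THRESHOLD:
        return Action.LOG
    return Action.PASS
```

Because only the high band blocks, code that calls the SDK endpoint sees either a normal Firecrawl response or an error, never a partially flagged one.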

SDK vs Direct API Comparison

Feature           Firecrawl SDK                Direct API
Endpoint          /v1/firecrawl/scrape         /v1/guard/:proxyID/scrape
Proxy Selection   One proxy per API key        Any proxy per request
Response Format   Standard Firecrawl           Extended with security info
Security Options  Default behavior (sanitize)  Full customization
Block Behavior    Block high-threat only       Configurable (block/sanitize/warn)
Best For          Existing Firecrawl users     New implementations

Error Handling

The SDK endpoint returns Firecrawl-compatible errors:

{
  "success": false,
  "error": "Description of what went wrong"
}

Common error scenarios:

  • 401 Unauthorized: Invalid or missing API key
  • 400 Bad Request: API key not scoped to exactly one proxy
  • 403 Forbidden: Domain blocked by policy or content blocked due to security threats
  • 429 Too Many Requests: Monthly scrape limit exceeded (FREE tier)
  • 502 Bad Gateway: Firecrawl service unavailable
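
As a rough sketch, the status codes above can be turned into a structured error for logging or retry decisions. The describe_error helper and the ScrapeError type are hypothetical names, and the retryable flags are this example's assumption (only 502 is treated as transient), not documented Bastio behavior.

```python
from dataclasses import dataclass

# Status codes and meanings are copied from the list above.
# The boolean "retryable" flag is an assumption made for this sketch.
ERROR_TABLE = {
    400: ("API key not scoped to exactly one proxy", False),
    401: ("Invalid or missing API key", False),
    403: ("Domain blocked by policy or content blocked due to security threats", False),
    429: ("Monthly scrape limit exceeded (FREE tier)", False),
    502: ("Firecrawl service unavailable", True),
}


@dataclass
class ScrapeError:
    status: int      # HTTP status code
    meaning: str     # documented meaning of the status code
    detail: str      # the "error" string from the response body
    retryable: bool  # whether retrying later might succeed (assumed)


def describe_error(status: int, body: dict) -> ScrapeError:
    """Combine an HTTP status with a Firecrawl-style error body."""
    meaning, retryable = ERROR_TABLE.get(status, ("Unexpected error", False))
    detail = str(body.get("error", ""))
    return ScrapeError(status, meaning, detail, retryable)
```

For example, describe_error(502, {"success": False, "error": "upstream down"}) yields an error marked retryable, while a 403 is surfaced to the caller immediately.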

Rate Limits

Rate limits are inherited from your API key configuration:

  • Default: 100 requests/minute
  • FREE tier: 100 scrapes/month total
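
To stay under the per-minute default, a client can throttle itself before calling the SDK. The sketch below uses a rolling 60-second window; whether Bastio counts requests in a rolling or a fixed window is not stated in this guide, so treat the window model as an assumption. MinuteLimiter is a hypothetical helper, not part of either SDK.

```python
import time
from collections import deque


class MinuteLimiter:
    """Allow at most `limit` calls in any rolling `window`-second span."""

    def __init__(self, limit: int = 100, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable so the limiter can be tested
        self._stamps = deque()  # timestamps of recently admitted calls

    def try_acquire(self) -> bool:
        """Return True and record the call if it fits in the window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False
```

A caller would check limiter.try_acquire() before each scrape and sleep briefly when it returns False. This does not help with the FREE tier's monthly total, which is a separate quota.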

Domain Restrictions

If you have configured domain allow/block lists on your proxy, they apply to SDK requests too:

  • Allowlist mode: Only URLs matching your allowlist are permitted
  • Blocklist mode: URLs matching your blocklist are rejected
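
A pre-flight check along the following lines can catch disallowed URLs before a request is sent. The matching rule here (exact host or any subdomain of a listed domain) is an assumption for illustration; this guide does not specify how Bastio matches URLs against the lists, and is_allowed is a hypothetical helper.

```python
from urllib.parse import urlparse


def host_matches(host: str, pattern: str) -> bool:
    """True if host equals pattern or is a subdomain of it (assumed rule)."""
    host, pattern = host.lower(), pattern.lower()
    return host == pattern or host.endswith("." + pattern)


def is_allowed(url: str, mode: str, domains: list) -> bool:
    """Apply allowlist or blocklist semantics as described above."""
    host = urlparse(url).hostname or ""
    matched = any(host_matches(host, d) for d in domains)
    if mode == "allowlist":
        return matched       # only listed domains are permitted
    if mode == "blocklist":
        return not matched   # listed domains are rejected
    raise ValueError(f"unknown mode: {mode!r}")
```

The proxy still enforces the real lists server-side, so a local check like this only saves a round trip and a 403.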

Troubleshooting

API key not scoped to specific proxy

Your API key has access to multiple proxies or all proxies. Create a new API key with:

  • Access Type: Specific Proxy
  • Proxy: Select exactly one proxy

Scraping service unavailable

Your proxy does not have a Firecrawl BYOK key configured. Edit your proxy and add your Firecrawl API key in the Scraper Configuration section.

Content blocked due to security threats

The scraped content contained high-severity threats (score >= 0.7). This is Bastio protecting your AI agent from potentially malicious content. You can:

  • Review the URL manually to verify it is safe
  • Contact support if you believe this is a false positive
  • Use the direct API with block_behavior: "warn" to receive content with warnings instead
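
For the last option, a direct API request might be assembled as shown below. Only the path /v1/guard/:proxyID/scrape and the block_behavior option come from this guide. The host (assumed to be the same api.bastio.com used for the SDK endpoint), the "url" body field, and the Bearer authorization header are assumptions, so check the direct API reference before relying on them. The function only builds the request and does not send it.

```python
import json
import urllib.request


def build_guard_request(proxy_id: str, api_key: str, target_url: str,
                        block_behavior: str = "warn") -> urllib.request.Request:
    """Build (but do not send) a POST to the direct scrape endpoint."""
    # Path is from this guide; the host is assumed to match the SDK endpoint.
    endpoint = f"https://api.bastio.com/v1/guard/{proxy_id}/scrape"
    # "block_behavior" is named in this guide; the "url" key is an assumption.
    payload = {"url": target_url, "block_behavior": block_behavior}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. With "warn", flagged content is returned with warnings rather than blocked, so downstream code should inspect those warnings before handing the content to an agent.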

Migration from Firecrawl

If you are currently using Firecrawl directly:

  1. Keep your Firecrawl API key (you will use it as BYOK)
  2. Create a Bastio account and configure a proxy with your Firecrawl key
  3. Create a scoped Bastio API key
  4. Update your SDK initialization:

// Before (direct Firecrawl)
const app = new Firecrawl({ apiKey: 'fc-xxx' });

// After (Bastio with security scanning)
const app = new Firecrawl({
  apiKey: 'sk_bastio_xxx',
  apiUrl: 'https://api.bastio.com/v1/firecrawl'
});

Your existing code that uses app.scrape() will continue to work unchanged.
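
During a gradual rollout it can help to choose the target from configuration rather than hard-coding it. The sketch below builds the keyword arguments for the Python client shown earlier in this guide (api_key, api_url). The environment variable names BASTIO_API_KEY and FIRECRAWL_API_KEY are hypothetical, chosen only for this example.

```python
import os

BASTIO_URL = "https://api.bastio.com/v1/firecrawl"


def firecrawl_kwargs(env=os.environ) -> dict:
    """Return constructor arguments for the Firecrawl client.

    If a Bastio key is configured, route through Bastio; otherwise
    fall back to calling Firecrawl directly with its own key.
    Both variable names are hypothetical.
    """
    bastio_key = env.get("BASTIO_API_KEY")
    if bastio_key:
        return {"api_key": bastio_key, "api_url": BASTIO_URL}
    return {"api_key": env["FIRECRAWL_API_KEY"]}


# Usage (requires the firecrawl package):
#   from firecrawl import FirecrawlApp
#   app = FirecrawlApp(**firecrawl_kwargs())
```

Unsetting the Bastio variable reverts to direct Firecrawl access without a code change, which makes it easy to compare behavior before and after the migration.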