
Azure AI Foundry Integration

Complete guide to using Azure AI Foundry with Bastio for OpenAI, Llama, Mistral, DeepSeek, and Microsoft models.

Azure AI Foundry

Access OpenAI, Meta Llama, Mistral AI, DeepSeek, and Microsoft models through a single Azure credential with Bastio's full security protection.

Overview

Azure AI Foundry (formerly Azure AI Studio) provides a unified gateway to multiple AI providers through Microsoft Azure's infrastructure. With Bastio, you can:

  • One credential, five vendors - Access OpenAI, Meta, Mistral, DeepSeek, and Microsoft models with a single Azure API key
  • Enterprise-grade security - Azure compliance certifications, VNet integration, private endpoints
  • Unified billing - All usage consolidated in your Azure subscription
  • Full security coverage - All Bastio security features work across all providers
  • OpenAI-compatible API - Same API format for all models, no code changes needed

Why Azure AI Foundry?

Unlike direct provider integrations, Azure AI Foundry offers unique advantages:

| Feature | Direct Providers | Azure AI Foundry |
|---|---|---|
| Credentials needed | 5 separate API keys | 1 Azure API key |
| Billing | 5 separate invoices | 1 Azure invoice |
| Compliance | Varies by provider | Azure certifications (SOC, HIPAA, ISO) |
| Network security | Public internet | VNet, private endpoints |
| API format | Different per provider | Unified OpenAI-compatible |
| Model access | Individual agreements | Model Catalog marketplace |

Supported Models

OpenAI Models (via Azure OpenAI)

Native OpenAI models available through Azure OpenAI Service:

GPT-4o Family

| Model | Context | Max Output | Input Price | Output Price | Vision | Tools |
|---|---|---|---|---|---|---|
| gpt-4o | 128K tokens | 16K tokens | $2.50/1M | $10.00/1M | Yes | Yes |
| gpt-4o-2024-11-20 | 128K tokens | 16K tokens | $2.50/1M | $10.00/1M | Yes | Yes |
| gpt-4o-mini | 128K tokens | 16K tokens | $0.15/1M | $0.60/1M | Yes | Yes |

GPT-4 Turbo

| Model | Context | Max Output | Input Price | Output Price | Vision | Tools |
|---|---|---|---|---|---|---|
| gpt-4-turbo | 128K tokens | 4K tokens | $10.00/1M | $30.00/1M | Yes | Yes |

GPT-4 Base

| Model | Context | Max Output | Input Price | Output Price | Vision | Tools |
|---|---|---|---|---|---|---|
| gpt-4 | 8K tokens | 8K tokens | $30.00/1M | $60.00/1M | No | Yes |
| gpt-4-32k | 32K tokens | 32K tokens | $60.00/1M | $120.00/1M | No | Yes |

GPT-3.5 Turbo

| Model | Context | Max Output | Input Price | Output Price |
|---|---|---|---|---|
| gpt-35-turbo | 16K tokens | 4K tokens | $0.50/1M | $1.50/1M |
| gpt-35-turbo-16k | 16K tokens | 16K tokens | $3.00/1M | $4.00/1M |

o1 Reasoning Models

| Model | Context | Max Output | Input Price | Output Price |
|---|---|---|---|---|
| o1-preview | 128K tokens | 32K tokens | $15.00/1M | $60.00/1M |
| o1-mini | 128K tokens | 65K tokens | $3.00/1M | $12.00/1M |

Meta Llama Models (via Model Catalog)

| Model | Context | Max Output | Input Price | Output Price | Vision | Tools |
|---|---|---|---|---|---|---|
| Meta-Llama-3.1-405B-Instruct | 128K tokens | 4K tokens | $5.33/1M | $16.00/1M | No | Yes |
| Meta-Llama-3.1-70B-Instruct | 128K tokens | 4K tokens | $2.68/1M | $3.54/1M | No | Yes |
| Meta-Llama-3.1-8B-Instruct | 128K tokens | 4K tokens | $0.30/1M | $0.61/1M | No | Yes |
| Llama-3.2-90B-Vision-Instruct | 128K tokens | 4K tokens | $2.00/1M | $2.00/1M | Yes | No |
| Llama-3.2-11B-Vision-Instruct | 128K tokens | 4K tokens | $0.37/1M | $0.37/1M | Yes | No |

Mistral AI Models (via Model Catalog)

| Model | Context | Max Output | Input Price | Output Price | Tools |
|---|---|---|---|---|---|
| Mistral-Large-2407 | 128K tokens | 8K tokens | $2.00/1M | $6.00/1M | Yes |
| Mistral-Small | 128K tokens | 8K tokens | $1.00/1M | $3.00/1M | Yes |
| Codestral-2405 | 32K tokens | 8K tokens | $1.00/1M | $3.00/1M | No |
| Mistral-Nemo | 128K tokens | 8K tokens | $0.30/1M | $0.30/1M | Yes |

DeepSeek Models (via Model Catalog)

| Model | Context | Max Output | Input Price | Output Price |
|---|---|---|---|---|
| DeepSeek-R1 | 64K tokens | 8K tokens | $0.55/1M | $2.19/1M |
| DeepSeek-V3-0324 | 64K tokens | 8K tokens | $0.27/1M | $1.10/1M |

Microsoft Models

| Model | Context | Max Output | Input Price | Output Price |
|---|---|---|---|---|
| Phi-4 | 16K tokens | 4K tokens | $0.125/1M | $0.50/1M |
| Phi-3.5-mini-instruct | 128K tokens | 4K tokens | $0.13/1M | $0.52/1M |

Embedding Models

| Model | Dimensions | Input Price |
|---|---|---|
| text-embedding-3-large | 3072 | $0.13/1M |
| text-embedding-3-small | 1536 | $0.02/1M |
| text-embedding-ada-002 | 1536 | $0.10/1M |

Quick Start

Prerequisites

  1. Azure account with active subscription
  2. Azure AI Hub or Azure OpenAI resource
  3. Models deployed in your project
  4. API key from Azure Portal

Step 1: Create Azure AI Resource

  1. Go to the Azure Portal
  2. Click Create a resource > AI + Machine Learning
  3. Choose Azure AI Hub (for all models) or Azure OpenAI (for OpenAI models only)
  4. Select your subscription and resource group
  5. Choose a region (e.g., East US, West Europe)
  6. Click Review + create > Create

Step 2: Deploy Models

For Azure OpenAI Models:

  1. Go to your Azure OpenAI resource
  2. Click Model deployments > Create new deployment
  3. Select the model (e.g., gpt-4o)
  4. Name your deployment (e.g., gpt-4o-production)
  5. Set tokens-per-minute quota
  6. Click Create

For Model Catalog Models (Llama, Mistral, DeepSeek):

  1. Go to Azure AI Studio
  2. Navigate to Model Catalog
  3. Search for your model (e.g., Llama-3.1-70B)
  4. Click Deploy > Serverless API
  5. Accept terms and conditions
  6. Click Deploy

Step 3: Get API Credentials

  1. Go to your Azure AI resource in the Portal
  2. Navigate to Keys and Endpoint
  3. Copy KEY 1 or KEY 2
  4. Copy the Endpoint URL
  5. Note your resource name (the first part of the endpoint URL)
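The resource name is simply the first hostname label of either endpoint style; a minimal sketch for deriving it (the helper name and sample endpoints are illustrative):

```python
from urllib.parse import urlparse

def resource_name_from_endpoint(endpoint: str) -> str:
    """Return the Azure resource name: the first label of the endpoint hostname."""
    host = urlparse(endpoint).hostname or ""
    return host.split(".")[0]

# Both endpoint styles yield the same resource name
print(resource_name_from_endpoint("https://my-azure-ai.openai.azure.com/"))       # my-azure-ai
print(resource_name_from_endpoint("https://my-azure-ai.services.ai.azure.com/"))  # my-azure-ai
```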

Step 4: Configure in Bastio

  1. Go to Dashboard > Proxies > Create New Proxy
  2. Select Azure AI Foundry as provider
  3. Enter your credentials as JSON (see Credential Format below)
  4. Click Create Proxy

BYOK Mode (Bring Your Own Key)

Use your own Azure credentials with Bastio.

Via Dashboard

  1. Go to Dashboard > Proxies > Create New Proxy
  2. Select Azure AI Foundry as provider
  3. Choose Your API Keys (BYOK) mode
  4. Enter your Azure credentials as JSON:
{
  "resource_name": "my-azure-ai",
  "api_key": "your-api-key-here",
  "api_version": "2024-10-21",
  "endpoint_type": "inference"
}
  5. Click Create Proxy

Via API

# Create Azure AI Foundry proxy
curl -X POST https://api.bastio.ai/proxy \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Production Azure AI",
    "provider": "azure",
    "llm_mode": "byok",
    "model_behavior": "passthrough"
  }'

# Add Azure credentials
curl -X POST https://api.bastio.ai/keys/provider \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "azure",
    "key_name": "Azure Production",
    "api_key": "{\"resource_name\":\"my-azure-ai\",\"api_key\":\"xxx\",\"api_version\":\"2024-10-21\",\"endpoint_type\":\"inference\"}"
  }'

Model Deployment Guides

Deploying OpenAI Models

Step 1: Access Azure OpenAI

  1. Go to Azure Portal
  2. Navigate to your Azure OpenAI resource
  3. Click Go to Azure OpenAI Studio

Step 2: Create Deployment

  1. Click Deployments > Create new deployment
  2. Select model (e.g., gpt-4o)
  3. Enter deployment name (e.g., gpt-4o)
  4. Select model version
  5. Set tokens-per-minute quota
  6. Click Create

Step 3: Configure Deployment Mapping

If your deployment name differs from the model name, add a mapping:

{
  "resource_name": "my-azure-ai",
  "api_key": "xxx",
  "deployment_mappings": {
    "gpt-4o": "my-gpt4o-deployment"
  }
}

Deploying Meta Llama Models

Step 1: Access Model Catalog

  1. Go to Azure AI Studio
  2. Navigate to Model Catalog
  3. Search for "Llama"

Step 2: Deploy Model

  1. Click on your desired model (e.g., Meta-Llama-3.1-70B-Instruct)
  2. Click Deploy > Serverless API
  3. Review pricing information
  4. Accept Meta's license agreement
  5. Click Deploy

Step 3: Verify Deployment

Test with a direct API call:

curl -X POST \
  "https://my-azure-ai.services.ai.azure.com/models/chat/completions" \
  -H "api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Meta-Llama-3.1-70B-Instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 100
  }'

Deploying Mistral AI Models

Step 1: Access Model Catalog

  1. Go to Azure AI Studio
  2. Navigate to Model Catalog
  3. Search for "Mistral"

Step 2: Deploy Model

  1. Click on your desired model (e.g., Mistral-Large-2407)
  2. Click Deploy > Serverless API
  3. Review pricing information
  4. Accept Mistral's terms
  5. Click Deploy

Deploying DeepSeek Models

Step 1: Access Model Catalog

  1. Go to Azure AI Studio
  2. Navigate to Model Catalog
  3. Search for "DeepSeek"

Step 2: Deploy Model

  1. Click on your desired model (e.g., DeepSeek-R1)
  2. Click Deploy > Serverless API
  3. Review pricing information
  4. Accept DeepSeek's terms
  5. Click Deploy

Making Requests

Python Example

from openai import OpenAI

client = OpenAI(
    base_url="https://api.bastio.ai/v1/guard/{PROXY_ID}/v1",
    api_key="your-bastio-api-key"
)

# Using GPT-4o
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ]
)

print(response.choices[0].message.content)

JavaScript Example

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.bastio.ai/v1/guard/{PROXY_ID}/v1',
  apiKey: process.env.BASTIO_API_KEY,
});

// Using Llama via Azure
const response = await client.chat.completions.create({
  model: 'Meta-Llama-3.1-70B-Instruct',
  messages: [
    { role: 'user', content: 'Write a haiku about AI' }
  ],
});

console.log(response.choices[0].message.content);

Streaming Example

from openai import OpenAI

client = OpenAI(
    base_url="https://api.bastio.ai/v1/guard/{PROXY_ID}/v1",
    api_key="your-bastio-api-key"
)

# Streaming with Mistral
stream = client.chat.completions.create(
    model="Mistral-Large-2407",
    messages=[{"role": "user", "content": "Write a short story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Using Different Providers

Same proxy, different models - Bastio routes automatically:

# OpenAI GPT-4o
client.chat.completions.create(model="gpt-4o", ...)

# Meta Llama
client.chat.completions.create(model="Meta-Llama-3.1-70B-Instruct", ...)

# Mistral AI
client.chat.completions.create(model="Mistral-Large-2407", ...)

# DeepSeek
client.chat.completions.create(model="DeepSeek-R1", ...)

# Microsoft Phi
client.chat.completions.create(model="Phi-4", ...)

Model Routing

Bastio automatically routes requests based on the model name. All models use the unified Azure AI Inference API with OpenAI-compatible format.

Automatic Routing

  • OpenAI models (gpt-*, o1-*): Routed to Azure OpenAI endpoint
  • Llama models (Meta-Llama-*, Llama-*): Routed to Model Catalog endpoint
  • Mistral models (Mistral-*, Codestral-*): Routed to Model Catalog endpoint
  • DeepSeek models (DeepSeek-*): Routed to Model Catalog endpoint
  • Microsoft models (Phi-*): Routed to Model Catalog endpoint
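The prefix rules above can be illustrated with a small sketch for the chat models listed. This is illustrative only; Bastio's actual routing is internal, and `route_model` is a hypothetical name:

```python
# Model names beginning with these prefixes go to the Azure OpenAI endpoint;
# everything else in the list above goes to the Model Catalog endpoint.
OPENAI_PREFIXES = ("gpt-", "o1-")

def route_model(model: str) -> str:
    """Pick a backend for a chat model name, mirroring the prefix rules above."""
    if model.lower().startswith(OPENAI_PREFIXES):
        return "azure-openai"
    return "model-catalog"

print(route_model("gpt-4o"))                        # azure-openai
print(route_model("Meta-Llama-3.1-70B-Instruct"))   # model-catalog
```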

Deployment Name Mappings

For Azure OpenAI deployments with custom names, use deployment_mappings:

{
  "resource_name": "my-azure-ai",
  "api_key": "xxx",
  "deployment_mappings": {
    "gpt-4o": "production-gpt4o",
    "gpt-4o-mini": "production-gpt4o-mini"
  }
}

When you request gpt-4o, Bastio will route to the production-gpt4o deployment.

Credential Format

Full Credential Structure

{
  "resource_name": "your-azure-resource",
  "api_key": "your-api-key",
  "api_version": "2024-10-21",
  "endpoint_type": "inference",
  "deployment_mappings": {
    "gpt-4o": "my-gpt4o-deployment",
    "gpt-4o-mini": "my-gpt4o-mini-deployment"
  }
}

Required Fields

  • resource_name: Your Azure AI resource name (the prefix of your endpoint URL)
  • api_key: API key from Azure Portal (Keys and Endpoint section)

Optional Fields

  • api_version: Azure API version (default: 2024-10-21)
  • endpoint_type: inference (unified Model Inference API) or openai (Azure OpenAI API)
  • deployment_mappings: Map model names to deployment names
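A client-side sanity check before saving credentials can mirror these rules; a minimal sketch assuming the required fields and defaults listed above (`parse_credentials` is a hypothetical helper, not part of Bastio):

```python
import json

REQUIRED = {"resource_name", "api_key"}
DEFAULTS = {"api_version": "2024-10-21", "endpoint_type": "inference"}

def parse_credentials(raw: str) -> dict:
    """Parse the credential JSON, enforce required fields, and apply documented defaults."""
    creds = json.loads(raw)  # raises ValueError on invalid JSON (e.g. trailing commas)
    missing = REQUIRED - creds.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    return {**DEFAULTS, **creds}

creds = parse_credentials('{"resource_name": "my-azure-ai", "api_key": "xxx"}')
print(creds["api_version"])  # 2024-10-21 (default applied)
```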

Endpoint Types

Inference Endpoint (Recommended)

  • URL: https://{resource}.services.ai.azure.com/models/chat/completions
  • Works with all models (OpenAI + Model Catalog)
  • Model specified in request body
  • Simpler configuration

OpenAI Endpoint

  • URL: https://{resource}.openai.azure.com/openai/deployments/{deployment}/chat/completions
  • Per-deployment URLs
  • Requires deployment mappings
  • Compatible with existing Azure OpenAI customers

Pricing & Cost Tracking

Pricing Comparison

Azure AI Foundry pricing is generally identical to direct provider pricing:

| Provider | Model | Direct | Azure |
|---|---|---|---|
| OpenAI | GPT-4o | $2.50/$10 | $2.50/$10 |
| OpenAI | GPT-4o Mini | $0.15/$0.60 | $0.15/$0.60 |
| Meta | Llama 3.1 70B | $2.68/$3.54 | $2.68/$3.54 |
| Mistral | Mistral Large | $2/$6 | $2/$6 |
| DeepSeek | DeepSeek R1 | $0.55/$2.19 | $0.55/$2.19 |

Prices per 1M tokens (input/output)
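Estimating a single request's cost from these per-1M-token prices is straightforward; a small sketch:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_per_m: float, output_per_m: float) -> float:
    """Cost in USD, given per-1M-token input/output prices from the tables above."""
    return (input_tokens / 1_000_000 * input_per_m
            + output_tokens / 1_000_000 * output_per_m)

# gpt-4o at $2.50 input / $10.00 output per 1M tokens
print(round(request_cost(10_000, 2_000, 2.50, 10.00), 4))  # 0.045
```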

Cost Tracking Features

Bastio automatically tracks costs across all Azure AI models:

  • Dashboard - Real-time spending across all models
  • Analytics - Historical cost analysis by model, user, time
  • Billing - Detailed breakdowns by provider and model
  • Alerts - Set spending limits and notifications

Troubleshooting

Permission Denied

Error: AuthenticationError or Access denied

Solutions:

  1. Verify API key is correct (copy fresh from Azure Portal)
  2. Check API key hasn't been regenerated
  3. Ensure the key has access to all required deployments
  4. Test key directly with Azure endpoint

Quota Exceeded

Error: RateLimitError or Quota exceeded

Solutions:

  1. Go to Quotas in Azure Portal
  2. Request quota increase for the model
  3. Consider deploying in additional regions
  4. Use a different model temporarily
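While waiting for a quota increase, client-side retries with exponential backoff and jitter can absorb transient rate-limit errors; a minimal delay-schedule sketch (illustrative, not a Bastio feature; the function name and constants are assumptions):

```python
import random

def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponential backoff with jitter: the seconds to wait before each retry attempt."""
    return [min(cap, base * 2 ** i) * random.uniform(0.5, 1.0) for i in range(retries)]

# Sleep for each delay between retry attempts, e.g.:
#   for delay in backoff_delays(5):
#       ... retry the request, time.sleep(delay) on RateLimitError ...
print(len(backoff_delays(5)))  # 5
```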

Model Not Found

Error: Model not found or Deployment not found

Solutions:

  1. Verify model is deployed in Azure AI Studio
  2. Check deployment name matches model name (or use deployment_mappings)
  3. Confirm deployment is in the same resource as your API key
  4. For Model Catalog models, ensure terms are accepted

Credential Issues

Error: Invalid credentials or Could not parse credentials

Solutions:

  1. Verify JSON is valid (no trailing commas, proper quotes)
  2. Check resource_name matches your Azure resource
  3. Ensure api_key is a valid Azure API key
  4. Test credentials with curl first

Region Availability

Error: Region not available or endpoint errors

Solutions:

  1. Check model availability in your region
  2. Model Catalog models may have limited regional availability
  3. Consider East US or West Europe for best availability
  4. OpenAI models generally have wider availability

Frequently Asked Questions

Q: Can I use both Azure AI and direct provider APIs?

A: Yes! Create separate proxies for each. For example, use an Azure AI proxy for enterprise workloads and a direct OpenAI proxy for development.

Q: Does streaming work for all models?

A: Yes, streaming is fully supported for all Azure AI models including OpenAI, Llama, Mistral, DeepSeek, and Microsoft models.

Q: What's the difference between Azure OpenAI and Azure AI Foundry?

A: Azure OpenAI provides only OpenAI models. Azure AI Foundry (via Model Catalog) provides access to Llama, Mistral, DeepSeek, and Microsoft models in addition to OpenAI. Both can be accessed through a single Bastio proxy.

Q: Do I need separate deployments for each model?

A: For Azure OpenAI models (GPT-4o, etc.), yes - you need to deploy each model. For Model Catalog models (Llama, Mistral), the serverless API handles this automatically.

Q: How do deployment mappings work?

A: If your Azure OpenAI deployment name differs from the model name (e.g., deployment prod-gpt4 for model gpt-4o), add a mapping in your credentials. This tells Bastio which deployment to use for each model name.

Q: Can I use Azure's content filtering?

A: Yes, Azure's built-in content filtering applies on top of Bastio's security features, giving you multiple layers of protection.

Q: Which models support vision/images?

A: GPT-4o, GPT-4o Mini, GPT-4 Turbo, and Llama 3.2 Vision models all support image inputs.

When to Use Azure AI Foundry

Choose Azure AI Foundry if you:

  • Already have Azure infrastructure
  • Need Azure compliance certifications (SOC 2, HIPAA, ISO)
  • Want consolidated billing through Azure
  • Need VNet integration or private endpoints
  • Want to access multiple AI providers with one credential
  • Have Azure enterprise agreements
  • Prefer simple API key authentication

Choose Direct Providers if you:

  • Want the simplest possible setup
  • Don't have an Azure account
  • Need the absolute latest model features immediately
  • Prefer direct vendor relationships
  • Have existing provider API keys

Need help? Contact support@bastio.ai or visit our support page.