Azure AI Foundry Integration
Complete guide to using Azure AI Foundry with Bastio for OpenAI, Llama, Mistral, DeepSeek, and Microsoft models.
Azure AI Foundry
Access OpenAI, Meta Llama, Mistral AI, DeepSeek, and Microsoft models through a single Azure credential with Bastio's full security protection.
Overview
Azure AI Foundry (formerly Azure AI Studio) provides a unified gateway to multiple AI providers through Microsoft Azure's infrastructure. With Bastio, you get:
- One credential, five vendors - Access OpenAI, Meta, Mistral, DeepSeek, and Microsoft models with a single Azure API key
- Enterprise-grade security - Azure compliance certifications, VNet integration, private endpoints
- Unified billing - All usage consolidated in your Azure subscription
- Full security coverage - All Bastio security features work across all providers
- OpenAI-compatible API - Same API format for all models, no code changes needed
Why Azure AI Foundry?
Unlike direct provider integrations, Azure AI Foundry offers unique advantages:
| Feature | Direct Providers | Azure AI Foundry |
|---|---|---|
| Credentials needed | 5 separate API keys | 1 Azure API key |
| Billing | 5 separate invoices | 1 Azure invoice |
| Compliance | Varies by provider | Azure certifications (SOC, HIPAA, ISO) |
| Network security | Public internet | VNet, private endpoints |
| API format | Different per provider | Unified OpenAI-compatible |
| Model access | Individual agreements | Model Catalog marketplace |
Supported Models
OpenAI Models (via Azure OpenAI)
Native OpenAI models available through Azure OpenAI Service:
GPT-4o Family
| Model | Context | Max Output | Input Price | Output Price | Vision | Tools |
|---|---|---|---|---|---|---|
| gpt-4o | 128K tokens | 16K tokens | $2.50/1M | $10.00/1M | Yes | Yes |
| gpt-4o-2024-11-20 | 128K tokens | 16K tokens | $2.50/1M | $10.00/1M | Yes | Yes |
| gpt-4o-mini | 128K tokens | 16K tokens | $0.15/1M | $0.60/1M | Yes | Yes |
GPT-4 Turbo
| Model | Context | Max Output | Input Price | Output Price | Vision | Tools |
|---|---|---|---|---|---|---|
| gpt-4-turbo | 128K tokens | 4K tokens | $10.00/1M | $30.00/1M | Yes | Yes |
GPT-4 Base
| Model | Context | Max Output | Input Price | Output Price | Vision | Tools |
|---|---|---|---|---|---|---|
| gpt-4 | 8K tokens | 8K tokens | $30.00/1M | $60.00/1M | No | Yes |
| gpt-4-32k | 32K tokens | 32K tokens | $60.00/1M | $120.00/1M | No | Yes |
GPT-3.5 Turbo
| Model | Context | Max Output | Input Price | Output Price |
|---|---|---|---|---|
| gpt-35-turbo | 16K tokens | 4K tokens | $0.50/1M | $1.50/1M |
| gpt-35-turbo-16k | 16K tokens | 16K tokens | $3.00/1M | $4.00/1M |
o1 Reasoning Models
| Model | Context | Max Output | Input Price | Output Price |
|---|---|---|---|---|
| o1-preview | 128K tokens | 32K tokens | $15.00/1M | $60.00/1M |
| o1-mini | 128K tokens | 65K tokens | $3.00/1M | $12.00/1M |
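The o1 models accept the same chat format, with one caveat worth flagging: at the time of writing, OpenAI's o1 series expects a `max_completion_tokens` parameter rather than `max_tokens` (the budget covers hidden reasoning tokens as well as the visible answer). A minimal request-body sketch, not an official Bastio example:

```python
import json

# Chat request body for an o1 model in the OpenAI-compatible wire format.
# Note max_completion_tokens: the o1 series rejects the older max_tokens name.
body = {
    "model": "o1-mini",
    "messages": [{"role": "user", "content": "How many primes are below 30?"}],
    "max_completion_tokens": 2048,
}
print(json.dumps(body, indent=2))
```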
Meta Llama Models (via Model Catalog)
| Model | Context | Max Output | Input Price | Output Price | Vision | Tools |
|---|---|---|---|---|---|---|
| Meta-Llama-3.1-405B-Instruct | 128K tokens | 4K tokens | $5.33/1M | $16.00/1M | No | Yes |
| Meta-Llama-3.1-70B-Instruct | 128K tokens | 4K tokens | $2.68/1M | $3.54/1M | No | Yes |
| Meta-Llama-3.1-8B-Instruct | 128K tokens | 4K tokens | $0.30/1M | $0.61/1M | No | Yes |
| Llama-3.2-90B-Vision-Instruct | 128K tokens | 4K tokens | $2.00/1M | $2.00/1M | Yes | No |
| Llama-3.2-11B-Vision-Instruct | 128K tokens | 4K tokens | $0.37/1M | $0.37/1M | Yes | No |
Mistral AI Models (via Model Catalog)
| Model | Context | Max Output | Input Price | Output Price | Tools |
|---|---|---|---|---|---|
| Mistral-Large-2407 | 128K tokens | 8K tokens | $2.00/1M | $6.00/1M | Yes |
| Mistral-Small | 128K tokens | 8K tokens | $1.00/1M | $3.00/1M | Yes |
| Codestral-2405 | 32K tokens | 8K tokens | $1.00/1M | $3.00/1M | No |
| Mistral-Nemo | 128K tokens | 8K tokens | $0.30/1M | $0.30/1M | Yes |
DeepSeek Models (via Model Catalog)
| Model | Context | Max Output | Input Price | Output Price |
|---|---|---|---|---|
| DeepSeek-R1 | 64K tokens | 8K tokens | $0.55/1M | $2.19/1M |
| DeepSeek-V3-0324 | 64K tokens | 8K tokens | $0.27/1M | $1.10/1M |
Microsoft Models
| Model | Context | Max Output | Input Price | Output Price |
|---|---|---|---|---|
| Phi-4 | 16K tokens | 4K tokens | $0.125/1M | $0.50/1M |
| Phi-3.5-mini-instruct | 128K tokens | 4K tokens | $0.13/1M | $0.52/1M |
Embedding Models
| Model | Dimensions | Input Price |
|---|---|---|
| text-embedding-3-large | 3072 | $0.13/1M |
| text-embedding-3-small | 1536 | $0.02/1M |
| text-embedding-ada-002 | 1536 | $0.10/1M |
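Embedding responses are plain float vectors, so downstream similarity math needs no SDK at all. A stdlib sketch using made-up 3-dimensional vectors standing in for real 1536-dimensional `text-embedding-3-small` output:

```python
import math

# Hypothetical vectors standing in for response.data[i].embedding values.
a = [0.1, 0.3, 0.5]
b = [0.2, 0.1, 0.6]

def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    norm_x = math.sqrt(sum(p * p for p in x))
    norm_y = math.sqrt(sum(q * q for q in y))
    return dot / (norm_x * norm_y)

print(round(cosine_similarity(a, b), 3))  # 0.924
```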
Quick Start
Prerequisites
- Azure account with active subscription
- Azure AI Hub or Azure OpenAI resource
- Models deployed in your project
- API key from Azure Portal
Step 1: Create Azure AI Resource
- Go to the Azure Portal
- Click Create a resource > AI + Machine Learning
- Choose Azure AI Hub (for all models) or Azure OpenAI (for OpenAI models only)
- Select your subscription and resource group
- Choose a region (e.g., East US, West Europe)
- Click Review + create > Create
Step 2: Deploy Models
For Azure OpenAI Models:
- Go to your Azure OpenAI resource
- Click Model deployments > Create new deployment
- Select the model (e.g., `gpt-4o`)
- Name your deployment (e.g., `gpt-4o-production`)
- Set tokens-per-minute quota
- Click Create
For Model Catalog Models (Llama, Mistral, DeepSeek):
- Go to Azure AI Studio
- Navigate to Model Catalog
- Search for your model (e.g., `Llama-3.1-70B`)
- Click Deploy > Serverless API
- Accept terms and conditions
- Click Deploy
Step 3: Get API Credentials
- Go to your Azure AI resource in the Portal
- Navigate to Keys and Endpoint
- Copy KEY 1 or KEY 2
- Copy the Endpoint URL
- Note your resource name (the first part of the endpoint URL)
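If you prefer not to eyeball the endpoint URL, the resource name can be extracted programmatically; a small stdlib sketch (the endpoint below is a made-up example):

```python
from urllib.parse import urlparse

def resource_name_from_endpoint(endpoint: str) -> str:
    """Return the first hostname label, which is the Azure resource name."""
    return urlparse(endpoint).hostname.split(".")[0]

print(resource_name_from_endpoint("https://my-azure-ai.services.ai.azure.com/"))  # my-azure-ai
```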
Step 4: Configure in Bastio
- Go to Dashboard > Proxies > Create New Proxy
- Select Azure AI Foundry as provider
- Enter your credentials as JSON (see Credential Format below)
- Click Create Proxy
BYOK Mode (Bring Your Own Key)
Use your own Azure credentials with Bastio.
Via Dashboard
- Go to Dashboard > Proxies > Create New Proxy
- Select Azure AI Foundry as provider
- Choose Your API Keys (BYOK) mode
- Enter your Azure credentials as JSON:
```json
{
  "resource_name": "my-azure-ai",
  "api_key": "your-api-key-here",
  "api_version": "2024-10-21",
  "endpoint_type": "inference"
}
```
- Click Create Proxy
Via API
```bash
# Create Azure AI Foundry proxy
curl -X POST https://api.bastio.ai/proxy \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Production Azure AI",
    "provider": "azure",
    "llm_mode": "byok",
    "model_behavior": "passthrough"
  }'

# Add Azure credentials
curl -X POST https://api.bastio.ai/keys/provider \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "azure",
    "key_name": "Azure Production",
    "api_key": "{\"resource_name\":\"my-azure-ai\",\"api_key\":\"xxx\",\"api_version\":\"2024-10-21\",\"endpoint_type\":\"inference\"}"
  }'
```
Model Deployment Guides
Deploying OpenAI Models
Step 1: Access Azure OpenAI
- Go to Azure Portal
- Navigate to your Azure OpenAI resource
- Click Go to Azure OpenAI Studio
Step 2: Create Deployment
- Click Deployments > Create new deployment
- Select model (e.g., `gpt-4o`)
- Enter deployment name (e.g., `gpt-4o`)
- Select model version
- Set tokens-per-minute quota
- Click Create
Step 3: Configure Deployment Mapping
If your deployment name differs from the model name, add a mapping:
```json
{
  "resource_name": "my-azure-ai",
  "api_key": "xxx",
  "deployment_mappings": {
    "gpt-4o": "my-gpt4o-deployment"
  }
}
```
Deploying Meta Llama Models
Step 1: Access Model Catalog
- Go to Azure AI Studio
- Navigate to Model Catalog
- Search for "Llama"
Step 2: Deploy Model
- Click on your desired model (e.g., `Meta-Llama-3.1-70B-Instruct`)
- Click Deploy > Serverless API
- Review pricing information
- Accept Meta's license agreement
- Click Deploy
Step 3: Verify Deployment
Test with a direct API call:
```bash
curl -X POST \
  "https://my-azure-ai.services.ai.azure.com/models/chat/completions" \
  -H "api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Meta-Llama-3.1-70B-Instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 100
  }'
```
Deploying Mistral AI Models
Step 1: Access Model Catalog
- Go to Azure AI Studio
- Navigate to Model Catalog
- Search for "Mistral"
Step 2: Deploy Model
- Click on your desired model (e.g., `Mistral-Large-2407`)
- Click Deploy > Serverless API
- Review pricing information
- Accept Mistral's terms
- Click Deploy
Deploying DeepSeek Models
Step 1: Access Model Catalog
- Go to Azure AI Studio
- Navigate to Model Catalog
- Search for "DeepSeek"
Step 2: Deploy Model
- Click on your desired model (e.g., `DeepSeek-R1`)
- Click Deploy > Serverless API
- Review pricing information
- Accept DeepSeek's terms
- Click Deploy
Making Requests
Python Example
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.bastio.ai/v1/guard/{PROXY_ID}/v1",
    api_key="your-bastio-api-key"
)

# Using GPT-4o
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ]
)

print(response.choices[0].message.content)
```
JavaScript Example
```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.bastio.ai/v1/guard/{PROXY_ID}/v1',
  apiKey: process.env.BASTIO_API_KEY,
});

// Using Llama via Azure
const response = await client.chat.completions.create({
  model: 'Meta-Llama-3.1-70B-Instruct',
  messages: [
    { role: 'user', content: 'Write a haiku about AI' }
  ],
});

console.log(response.choices[0].message.content);
```
Streaming Example
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.bastio.ai/v1/guard/{PROXY_ID}/v1",
    api_key="your-bastio-api-key"
)

# Streaming with Mistral
stream = client.chat.completions.create(
    model="Mistral-Large-2407",
    messages=[{"role": "user", "content": "Write a short story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
Using Different Providers
Same proxy, different models - Bastio routes automatically:
```python
# OpenAI GPT-4o
client.chat.completions.create(model="gpt-4o", ...)

# Meta Llama
client.chat.completions.create(model="Meta-Llama-3.1-70B-Instruct", ...)

# Mistral AI
client.chat.completions.create(model="Mistral-Large-2407", ...)

# DeepSeek
client.chat.completions.create(model="DeepSeek-R1", ...)

# Microsoft Phi
client.chat.completions.create(model="Phi-4", ...)
```
Model Routing
Bastio automatically routes requests based on the model name. All models use the unified Azure AI Inference API with OpenAI-compatible format.
Automatic Routing
- OpenAI models (`gpt-*`, `o1-*`): Routed to Azure OpenAI endpoint
- Llama models (`Meta-Llama-*`, `Llama-*`): Routed to Model Catalog endpoint
- Mistral models (`Mistral-*`, `Codestral-*`): Routed to Model Catalog endpoint
- DeepSeek models (`DeepSeek-*`): Routed to Model Catalog endpoint
- Microsoft models (`Phi-*`): Routed to Model Catalog endpoint
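The prefix rules above can be sketched as a small lookup; this is an illustration of the routing behavior, not Bastio's actual implementation:

```python
def route(model: str) -> str:
    """Map a model name to its Azure endpoint family by name prefix."""
    if model.startswith(("gpt-", "o1-")):
        return "azure-openai"
    if model.startswith(("Meta-Llama-", "Llama-", "Mistral-", "Codestral-",
                         "DeepSeek-", "Phi-")):
        return "model-catalog"
    raise ValueError(f"no routing rule for model {model!r}")

print(route("gpt-4o"))                       # azure-openai
print(route("Meta-Llama-3.1-70B-Instruct"))  # model-catalog
```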
Deployment Name Mappings
For Azure OpenAI deployments with custom names, use `deployment_mappings`:
```json
{
  "resource_name": "my-azure-ai",
  "api_key": "xxx",
  "deployment_mappings": {
    "gpt-4o": "production-gpt4o",
    "gpt-4o-mini": "production-gpt4o-mini"
  }
}
```
When you request `gpt-4o`, Bastio will route to the `production-gpt4o` deployment.
Credential Format
Full Credential Structure
```json
{
  "resource_name": "your-azure-resource",
  "api_key": "your-api-key",
  "api_version": "2024-10-21",
  "endpoint_type": "inference",
  "deployment_mappings": {
    "gpt-4o": "my-gpt4o-deployment",
    "gpt-4o-mini": "my-gpt4o-mini-deployment"
  }
}
```
Required Fields
- `resource_name`: Your Azure AI resource name (the prefix of your endpoint URL)
- `api_key`: API key from the Azure Portal (Keys and Endpoint section)
Optional Fields
- `api_version`: Azure API version (default: `2024-10-21`)
- `endpoint_type`: `inference` (unified Model Inference API) or `openai` (Azure OpenAI API)
- `deployment_mappings`: Map model names to deployment names
Endpoint Types
Inference Endpoint (Recommended)
- URL: `https://{resource}.services.ai.azure.com/models/chat/completions`
- Works with all models (OpenAI + Model Catalog)
- Model specified in request body
- Simpler configuration
OpenAI Endpoint
- URL: `https://{resource}.openai.azure.com/openai/deployments/{deployment}/chat/completions`
- Per-deployment URLs
- Requires deployment mappings
- Compatible with existing Azure OpenAI customers
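Both URL shapes can be built mechanically from the credential fields. A sketch (note that direct calls to the Azure OpenAI endpoint also require an `api-version` query parameter, matching the `api_version` credential field):

```python
def inference_url(resource: str) -> str:
    """Unified Model Inference endpoint: one URL for every model."""
    return f"https://{resource}.services.ai.azure.com/models/chat/completions"

def openai_url(resource: str, deployment: str, api_version: str = "2024-10-21") -> str:
    """Azure OpenAI endpoint: one URL per deployment, api-version required."""
    return (
        f"https://{resource}.openai.azure.com/openai/deployments/"
        f"{deployment}/chat/completions?api-version={api_version}"
    )

print(inference_url("my-azure-ai"))
print(openai_url("my-azure-ai", "production-gpt4o"))
```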
Pricing & Cost Tracking
Pricing Comparison
Azure AI Foundry pricing is generally identical to direct provider pricing:
| Provider | Model | Direct | Azure |
|---|---|---|---|
| OpenAI | GPT-4o | $2.50/$10 | $2.50/$10 |
| OpenAI | GPT-4o Mini | $0.15/$0.60 | $0.15/$0.60 |
| Meta | Llama 3.1 70B | $2.68/$3.54 | $2.68/$3.54 |
| Mistral | Mistral Large | $2/$6 | $2/$6 |
| DeepSeek | DeepSeek R1 | $0.55/$2.19 | $0.55/$2.19 |
Prices per 1M tokens (input/output)
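Per-request cost follows directly from the tables: tokens times rate, divided by one million. A sketch using a few rates from above:

```python
# (input $/1M tokens, output $/1M tokens) from the pricing tables above.
PRICES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "Meta-Llama-3.1-70B-Instruct": (2.68, 3.54),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Rough dollar cost of one request at list prices."""
    input_rate, output_rate = PRICES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# e.g. 2,000 prompt tokens and 500 completion tokens on GPT-4o:
print(f"${estimate_cost('gpt-4o', 2_000, 500):.4f}")  # $0.0100
```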
Cost Tracking Features
Bastio automatically tracks costs across all Azure AI models:
- Dashboard - Real-time spending across all models
- Analytics - Historical cost analysis by model, user, time
- Billing - Detailed breakdowns by provider and model
- Alerts - Set spending limits and notifications
Troubleshooting
Permission Denied
Error: AuthenticationError or Access denied
Solutions:
- Verify API key is correct (copy fresh from Azure Portal)
- Check API key hasn't been regenerated
- Ensure the key has access to all required deployments
- Test key directly with Azure endpoint
Quota Exceeded
Error: RateLimitError or Quota exceeded
Solutions:
- Go to Quotas in Azure Portal
- Request quota increase for the model
- Consider deploying in additional regions
- Use a different model temporarily
Model Not Found
Error: Model not found or Deployment not found
Solutions:
- Verify model is deployed in Azure AI Studio
- Check deployment name matches model name (or use `deployment_mappings`)
- Confirm the deployment is in the same resource as your API key
- For Model Catalog models, ensure terms are accepted
Credential Issues
Error: Invalid credentials or Could not parse credentials
Solutions:
- Verify JSON is valid (no trailing commas, proper quotes)
- Check `resource_name` matches your Azure resource
- Ensure `api_key` is a valid Azure API key
- Test credentials with curl first
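The JSON checks above are easy to automate before pasting credentials into the dashboard; a stdlib sketch that mirrors the credential format documented in this guide (the field rules are inferred from this guide, not an official validator):

```python
import json

REQUIRED = {"resource_name", "api_key"}
OPTIONAL = {"api_version", "endpoint_type", "deployment_mappings"}

def check_credentials(raw: str) -> list:
    """Return a list of problems found in an Azure credential JSON string."""
    try:
        creds = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    problems = [f"missing required field: {f}" for f in sorted(REQUIRED - creds.keys())]
    problems += [f"unknown field: {f}" for f in sorted(creds.keys() - REQUIRED - OPTIONAL)]
    if creds.get("endpoint_type") not in (None, "inference", "openai"):
        problems.append("endpoint_type must be 'inference' or 'openai'")
    return problems

print(check_credentials('{"resource_name": "my-azure-ai", "api_key": "xxx"}'))  # []
```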
Region Availability
Error: Region not available or endpoint errors
Solutions:
- Check model availability in your region
- Model Catalog models may have limited regional availability
- Consider East US or West Europe for best availability
- OpenAI models generally have wider availability
Frequently Asked Questions
Q: Can I use both Azure AI and direct provider APIs?
A: Yes! Create separate proxies for each. For example, use an Azure AI proxy for enterprise workloads and a direct OpenAI proxy for development.
Q: Does streaming work for all models?
A: Yes, streaming is fully supported for all Azure AI models including OpenAI, Llama, Mistral, DeepSeek, and Microsoft models.
Q: What's the difference between Azure OpenAI and Azure AI Foundry?
A: Azure OpenAI provides only OpenAI models. Azure AI Foundry (via Model Catalog) provides access to Llama, Mistral, DeepSeek, and Microsoft models in addition to OpenAI. Both can be accessed through a single Bastio proxy.
Q: Do I need separate deployments for each model?
A: For Azure OpenAI models (GPT-4o, etc.), yes - you need to deploy each model. For Model Catalog models (Llama, Mistral), the serverless API handles this automatically.
Q: How do deployment mappings work?
A: If your Azure OpenAI deployment name differs from the model name (e.g., deployment `prod-gpt4` for model `gpt-4o`), add a mapping in your credentials. This tells Bastio which deployment to use for each model name.
Q: Can I use Azure's content filtering?
A: Yes, Azure's built-in content filtering applies on top of Bastio's security features, giving you multiple layers of protection.
Q: Which models support vision/images?
A: GPT-4o, GPT-4o Mini, GPT-4 Turbo, and Llama 3.2 Vision models all support image inputs.
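For the vision-capable models, image inputs use the OpenAI-compatible content-parts format. A sketch of the request body (the image URL is a placeholder):

```python
import json

# Mixed text + image message for a vision-capable model such as gpt-4o.
body = {
    "model": "gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
    "max_tokens": 300,
}
print(json.dumps(body, indent=2))
```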
When to Use Azure AI Foundry
Choose Azure AI Foundry if you:
- Already have Azure infrastructure
- Need Azure compliance certifications (SOC 2, HIPAA, ISO)
- Want consolidated billing through Azure
- Need VNet integration or private endpoints
- Want to access multiple AI providers with one credential
- Have Azure enterprise agreements
- Prefer simple API key authentication
Choose Direct Providers if you:
- Want the simplest possible setup
- Don't have an Azure account
- Need the absolute latest model features immediately
- Prefer direct vendor relationships
- Have existing provider API keys
Additional Resources
- Azure AI Foundry Documentation
- Azure AI Model Catalog
- Azure OpenAI Service
- Azure AI Pricing
- Azure Portal
- Bastio Support
Need help? Contact support@bastio.ai or visit our support page.