DeepSeek Integration
Complete guide to using DeepSeek with Bastio for DeepSeek-V3 chat and reasoning models.
Access DeepSeek's powerful AI models with advanced reasoning capabilities through Bastio's security layer.
Overview
DeepSeek offers state-of-the-art AI models with exceptional performance and competitive pricing. With Bastio, you can:
- Advanced reasoning - Access DeepSeek Reasoner with thinking mode for complex problem-solving
- Cost-effective - Industry-leading pricing at $0.28/1M input tokens
- OpenAI-compatible - Drop-in replacement API, no code changes needed
- Full security coverage - All Bastio security features work seamlessly
- Cache savings - 90% discount on cached prompts ($0.028/1M)
Why DeepSeek?
DeepSeek provides exceptional value with advanced capabilities:
| Feature | DeepSeek | GPT-4o | Claude Sonnet |
|---|---|---|---|
| Input Price (1M tokens) | $0.28 | $2.50 | $3.00 |
| Output Price (1M tokens) | $0.42 | $10.00 | $15.00 |
| Cache Hit Price | $0.028 | N/A | $0.30 |
| Context Window | 128K | 128K | 200K |
| Reasoning Mode | Yes | No | No |
| Tool Calling | Yes | Yes | Yes |
Supported Models
DeepSeek Chat
General-purpose chat model using DeepSeek-V3.2:
| Model | Context | Max Output | Input Price | Output Price | Tools | JSON |
|---|---|---|---|---|---|---|
| `deepseek-chat` | 128K tokens | 8K tokens | $0.28/1M | $0.42/1M | Yes | Yes |
Best for: General chat, code generation, analysis, and everyday tasks.
DeepSeek Reasoner
Advanced reasoning model with thinking mode:
| Model | Context | Max Output | Input Price | Output Price | Reasoning | Tools |
|---|---|---|---|---|---|---|
| `deepseek-reasoner` | 128K tokens | 64K tokens | $0.28/1M | $0.42/1M | Yes | Yes |
Best for: Complex problem-solving, math, logic, multi-step reasoning.
Quick Start
Prerequisites
- DeepSeek API key from platform.deepseek.com
- Bastio account
Step 1: Get Your DeepSeek API Key
- Go to platform.deepseek.com
- Sign in or create an account
- Navigate to API Keys
- Click Create new secret key
- Copy your API key
Step 2: Create a Proxy in Bastio
- Go to Dashboard > Proxies > Create New Proxy
- Select DeepSeek as provider
- Choose Your API Keys (BYOK) mode
- Enter your DeepSeek API key
- Click Create Proxy
Step 3: Start Making Requests
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.bastio.com/v1/guard/{PROXY_ID}/v1",
    api_key="your-bastio-api-key"
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ]
)

print(response.choices[0].message.content)
```
BYOK Mode (Bring Your Own Key)
DeepSeek integration is BYOK-only. Use your own DeepSeek API key with Bastio's security layer.
Via Dashboard
- Go to Dashboard > Proxies > Create New Proxy
- Select DeepSeek as provider
- Choose Your API Keys (BYOK) mode
- Enter your DeepSeek API key
- Select a default model (optional)
- Click Create Proxy
Via API
```bash
# Create a DeepSeek proxy
curl -X POST https://api.bastio.com/proxy \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Production DeepSeek",
    "provider": "deepseek",
    "llm_mode": "byok",
    "model_behavior": "passthrough"
  }'

# Add your DeepSeek API key
curl -X POST https://api.bastio.com/keys/provider \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "deepseek",
    "key_name": "DeepSeek Production",
    "api_key": "sk-your-deepseek-key"
  }'
```
Code Examples
Python (OpenAI SDK)
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.bastio.com/v1/guard/{PROXY_ID}/v1",
    api_key="your-bastio-api-key"
)

# Using DeepSeek Chat
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a Python function to sort a list"}
    ],
    temperature=0.7
)

print(response.choices[0].message.content)
```
JavaScript/TypeScript
```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.bastio.com/v1/guard/{PROXY_ID}/v1',
  apiKey: process.env.BASTIO_API_KEY,
});

// Using DeepSeek Reasoner for complex problems
const response = await client.chat.completions.create({
  model: 'deepseek-reasoner',
  messages: [
    { role: 'user', content: 'Solve this step by step: If 3x + 5 = 17, find x' }
  ],
});

console.log(response.choices[0].message.content);
```
Streaming
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.bastio.com/v1/guard/{PROXY_ID}/v1",
    api_key="your-bastio-api-key"
)

stream = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "user", "content": "Write a short story about AI"}
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
Tool/Function Calling
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.bastio.com/v1/guard/{PROXY_ID}/v1",
    api_key="your-bastio-api-key"
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name"
                    }
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "user", "content": "What's the weather in San Francisco?"}
    ],
    tools=tools,
    tool_choice="auto"
)

# Handle tool calls
if response.choices[0].message.tool_calls:
    for tool_call in response.choices[0].message.tool_calls:
        print(f"Function: {tool_call.function.name}")
        print(f"Arguments: {tool_call.function.arguments}")
```
JSON Mode
```python
from openai import OpenAI
import json

client = OpenAI(
    base_url="https://api.bastio.com/v1/guard/{PROXY_ID}/v1",
    api_key="your-bastio-api-key"
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "Respond in JSON format only."},
        {"role": "user", "content": "List 3 programming languages with their use cases"}
    ],
    response_format={"type": "json_object"}
)

data = json.loads(response.choices[0].message.content)
print(data)
```
Using DeepSeek Reasoner
DeepSeek Reasoner includes a "thinking mode" for complex reasoning tasks. The model shows its reasoning process before providing the final answer.
When to Use Reasoner
- Math problems - Step-by-step calculations
- Logic puzzles - Deductive reasoning
- Code debugging - Systematic analysis
- Complex analysis - Multi-factor decision making
- Research - Hypothesis testing and evaluation
Reasoner Example
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.bastio.com/v1/guard/{PROXY_ID}/v1",
    api_key="your-bastio-api-key"
)

# Reasoner excels at complex, multi-constraint problems
response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[
        {"role": "user", "content": """
A farmer has 100 acres. He wants to plant wheat and corn.
- Wheat yields $200/acre profit, needs 3 workers/acre
- Corn yields $300/acre profit, needs 5 workers/acre
- He has 350 workers available
- He must plant at least 20 acres of wheat (contract)
How should he allocate land to maximize profit?
"""}
    ],
    max_tokens=4096
)

print(response.choices[0].message.content)
```
Reasoner vs Chat Comparison
| Task Type | Use Chat | Use Reasoner |
|---|---|---|
| Simple Q&A | Yes | No |
| Code generation | Yes | For complex algorithms |
| Math problems | Simple | Complex multi-step |
| Creative writing | Yes | No |
| Analysis | Basic | In-depth |
| Debugging | Simple | Complex issues |
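The comparison above can be encoded as a simple routing rule in application code. A minimal sketch, assuming you categorize incoming tasks yourself (the task labels and helper name below are illustrative, not part of Bastio's or DeepSeek's API):

```python
# Illustrative task categories -- adapt these to your own application.
COMPLEX_TASKS = {"math", "logic", "complex-debugging", "deep-analysis"}

def pick_model(task_type: str) -> str:
    """Route complex reasoning work to deepseek-reasoner and everything
    else to the cheaper, faster deepseek-chat."""
    return "deepseek-reasoner" if task_type in COMPLEX_TASKS else "deepseek-chat"
```

Pass the result as the `model` argument of your chat completion call.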
Pricing & Cost Optimization
Current Pricing
| Component | Price |
|---|---|
| Input tokens | $0.28/1M |
| Input tokens (cache hit) | $0.028/1M |
| Output tokens | $0.42/1M |
Cache Hit Optimization
DeepSeek offers significant savings for repeated prompts:
- Cache miss: $0.28/1M tokens
- Cache hit: $0.028/1M tokens (90% savings!)
Tips for maximizing cache hits:
- Use consistent system prompts
- Batch similar requests
- Structure prompts with static prefix + dynamic suffix
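One way to follow the static-prefix rule is to build every request from a single shared system prompt, so the expensive prefix is byte-identical across calls. A sketch (the prompt text and helper name are hypothetical):

```python
# Keeping this prompt *identical* across requests lets DeepSeek's prefix
# cache serve it at the $0.028/1M cache-hit rate on subsequent calls.
STATIC_SYSTEM_PROMPT = (
    "You are a support assistant. Answer concisely and cite the relevant "
    "documentation section when possible."
)

def build_messages(user_query: str) -> list:
    """Static prefix first, per-request content last, to maximize cache hits."""
    return [
        {"role": "system", "content": STATIC_SYSTEM_PROMPT},
        {"role": "user", "content": user_query},
    ]
```

Any per-request variation (user name, timestamps, retrieved context) should go after the shared prefix, never inside it.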
Cost Comparison Example
For 1M input tokens + 500K output tokens:
| Provider | Cost |
|---|---|
| DeepSeek (cache miss) | $0.49 |
| DeepSeek (cache hit) | $0.24 |
| GPT-4o | $7.50 |
| Claude 3.5 Sonnet | $10.50 |
Troubleshooting
Invalid API Key
Error: Authentication error or Invalid API key
Solutions:
- Verify the API key is correct (it starts with `sk-`)
- Check that the API key hasn't been revoked
- Ensure you have credits in your DeepSeek account
- Test key directly with DeepSeek API
Rate Limiting
Error: Rate limit exceeded
Solutions:
- Reduce request frequency
- Implement exponential backoff
- Check your tier limits on platform.deepseek.com
- Consider upgrading your DeepSeek account
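Exponential backoff can be sketched as below. The helper names are illustrative; in real code, narrow `retry_on` to your SDK's rate-limit exception rather than catching everything:

```python
import random
import time

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Full-jitter backoff: a random delay in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def with_retries(call, max_attempts: int = 5, retry_on=(Exception,)):
    """Invoke call(), sleeping with exponential backoff between failed attempts."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            time.sleep(backoff_delay(attempt))
```

Usage: `with_retries(lambda: client.chat.completions.create(...))`.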
Model Not Found
Error: Model not found
Solutions:
- Use the correct model names: `deepseek-chat` or `deepseek-reasoner`
- Check for typos in the model name
- Ensure model is available in your region
Context Length Exceeded
Error: Context length exceeded
Solutions:
- Both models support a 128K context window; keep your total input below it
- Reduce prompt/conversation length
- Summarize earlier messages
- Use `max_tokens` to limit response length
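Trimming a long conversation can be as simple as keeping the system prompt plus the most recent turns. A naive sketch (the message-count cutoff is illustrative; a token-based budget is more precise):

```python
def trim_history(messages: list, keep_last: int = 20) -> list:
    """Keep all system messages plus the most recent keep_last other turns."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]
```

For longer sessions, replace the dropped turns with a model-generated summary instead of discarding them outright.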
Streaming Issues
Error: Chunks not received or incomplete
Solutions:
- Ensure `stream=True` is set
- Check network connectivity
- Implement timeout handling
- Verify server supports SSE
Best Practices
1. Model Selection
- Use `deepseek-chat` for general tasks
- Use `deepseek-reasoner` only for complex reasoning
- Don't use Reasoner for simple Q&A (wasteful)
2. Prompt Engineering
- Be clear and specific
- Use system prompts for consistent behavior
- Structure complex prompts logically
- Provide examples for desired output format
3. Cost Management
- Monitor usage in Bastio dashboard
- Set up spending alerts
- Optimize prompts to reduce tokens
- Leverage cache hits with consistent prefixes
4. Error Handling
- Implement retry logic with backoff
- Handle rate limits gracefully
- Log errors for debugging
- Set appropriate timeouts
API Reference
Supported Endpoints
| Endpoint | Supported |
|---|---|
| `/v1/chat/completions` | Yes |
| `/v1/models` | Yes |
| `/v1/embeddings` | No |
Request Parameters
| Parameter | Supported | Notes |
|---|---|---|
| `model` | Yes | `deepseek-chat` or `deepseek-reasoner` |
| `messages` | Yes | Standard chat format |
| `temperature` | Yes | 0-2 |
| `top_p` | Yes | 0-1 |
| `max_tokens` | Yes | Up to 8K (chat) or 64K (reasoner) |
| `stream` | Yes | SSE format |
| `tools` | Yes | Function calling |
| `tool_choice` | Yes | `auto`, `none`, or a specific function |
| `response_format` | Yes | `{"type": "json_object"}` |
| `frequency_penalty` | Yes | -2 to 2 |
| `presence_penalty` | Yes | -2 to 2 |
| `stop` | Yes | Up to 4 sequences |
Frequently Asked Questions
Q: What's the difference between DeepSeek Chat and Reasoner?
A: Chat is a fast general-purpose model. Reasoner includes "thinking mode" that shows step-by-step reasoning before answering, making it better for complex problems but slower and using more tokens.
Q: Does DeepSeek support vision/images?
A: No, DeepSeek currently only supports text inputs. For vision tasks, consider using GPT-4o or Claude 3.
Q: How do I get cache hit pricing?
A: Cache hits occur automatically when you send prompts with the same prefix. DeepSeek caches the computation of common prompt prefixes. Use consistent system prompts and templates to maximize cache hits.
Q: Does streaming work with DeepSeek?
A: Yes, both models fully support streaming responses via Server-Sent Events (SSE).
Q: Can I use DeepSeek for production workloads?
A: Yes, DeepSeek offers production-ready APIs with high availability. Check platform.deepseek.com for current SLAs and rate limits.
Q: What languages does DeepSeek support?
A: DeepSeek supports many languages including English, Chinese, and other major languages. Performance may vary by language.
Additional Resources
Need help? Contact hello@bastio.com or visit our support page.