Memory System
Enable long-term memory for your AI agents to persist context across sessions.
The Memory System allows your Bastio proxies to store and retrieve past interactions, providing your AI agents with long-term memory. This enables context-aware conversations that persist across different sessions.
Overview
When enabled, the Memory System:
- Stores user interactions (prompts and completions) in a vector database.
- Retrieves relevant past interactions based on the semantic similarity to the current user prompt.
- Injects this context into the system prompt of the current request, allowing the LLM to "remember" previous details.
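The injection step can be pictured with a small sketch. This is not Bastio's implementation: the function name `injectMemories`, the wording of the memory block, and the choice to extend an existing system message are all assumptions made for illustration.

```javascript
// Hypothetical helper: not part of any Bastio SDK.
// Adds retrieved memories to the system prompt of a chat request.
function injectMemories(messages, memories) {
  if (memories.length === 0) return messages;

  const memoryBlock =
    'Relevant details from earlier conversations:\n' +
    memories.map((m) => `- ${m}`).join('\n');

  const [first, ...rest] = messages;
  if (first && first.role === 'system') {
    // Extend the existing system prompt rather than replacing it.
    return [{ ...first, content: `${first.content}\n\n${memoryBlock}` }, ...rest];
  }
  // No system prompt yet: add one in front of the conversation.
  return [{ role: 'system', content: memoryBlock }, ...messages];
}

const augmented = injectMemories(
  [{ role: 'user', content: 'What color should I paint my room?' }],
  ['User said: "My favorite color is blue."']
);
console.log(augmented);
```

Because the memories travel in the system prompt, the messages your application sends are left unchanged in this sketch; only the request forwarded to the LLM carries the extra context.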
Configuration
You can configure memory settings for each proxy individually.
Enabling Memory
- Go to your Proxy Configuration in the Bastio Dashboard.
- Navigate to the Memory section.
- Toggle Enable Memory.
- Select the Memory Strategy (currently "Semantic" is supported).
Auto-Generate User ID
By default, the memory system requires a user ID, passed as the user field of the API request, to associate memories with a specific user.
If you want to enable memory without managing user IDs manually, you can enable Auto-generate User ID.
- Open the Memory section of your Proxy Configuration.
- Toggle Auto-generate User ID.
When this setting is enabled and a request arrives without a user ID, Bastio generates a consistent, anonymous ID from the request fingerprint (IP address, User-Agent, etc.).
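One way such a stable, anonymous ID could be derived is by hashing the fingerprint fields with a secret key. The sketch below is an assumption for illustration only: the exact inputs, the hash, the `anon_` prefix, and the function name `fingerprintUserId` are hypothetical and do not describe how Bastio actually computes the ID.

```javascript
const crypto = require('node:crypto');

// Hypothetical sketch: the real fingerprint inputs and ID format are not documented here.
// A keyed hash (HMAC) means the raw IP address cannot be recovered from the ID without the key.
function fingerprintUserId({ ip, userAgent }, secret) {
  const digest = crypto
    .createHmac('sha256', secret)
    // JSON-encode the fields so their boundaries are unambiguous.
    .update(JSON.stringify([ip, userAgent]))
    .digest('hex');
  return `anon_${digest.slice(0, 32)}`;
}

const id = fingerprintUserId(
  { ip: '203.0.113.7', userAgent: 'Mozilla/5.0' },
  'example-secret'
);
console.log(id);
```

In a scheme like this, the same client maps to the same ID only while its fingerprint stays the same. A user who switches networks or browsers would get a new ID and start with no memories, which is a reason to pass an explicit user ID when you have one.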
API Usage
With Explicit User ID
To use memory with a specific user, pass the user field in your API request:
const response = await client.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'My favorite color is blue.' }],
  user: 'user_12345' // Unique ID for the user
});
Subsequent requests with the same user ID will have access to the context established in previous turns.
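To make the cross-session behavior concrete, the sketch below sends two independent requests for the same user. It assumes `client` is an OpenAI-compatible SDK client that is already configured to route through your Bastio proxy. The helper names `chatAs` and `demo` are hypothetical.

```javascript
// Hypothetical helper: sends a single user message tagged with a user ID.
async function chatAs(client, userId, content) {
  return client.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content }],
    user: userId, // must be identical across sessions for memories to be found
  });
}

// Two separate sessions for the same user.
async function demo(client) {
  await chatAs(client, 'user_12345', 'My favorite color is blue.');
  // Later, in a new session that shares no message history with the first:
  return chatAs(client, 'user_12345', 'What is my favorite color?');
}
```

The second request carries only the new question. Any knowledge of the earlier statement has to come from the memory context that the proxy adds for `user_12345`.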
With Auto-Generated User ID
If Auto-generate User ID is enabled in your proxy settings, you can simply omit the user field:
const response = await client.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'My favorite color is blue.' }]
});
Bastio will automatically assign a stable ID to this client based on their request fingerprint.
How it Works
Storage
- Embedding: User prompts and assistant responses are converted into vector embeddings by an embedding model.
- Secure Storage: These embeddings are stored in a vector database and encrypted at rest.
Retrieval
- Semantic Search: When a new request comes in, the system compares the meaning of the current prompt with that user's stored interactions and ranks them by semantic similarity.
- Context Injection: It selects the most relevant past interactions for that user and injects them into the system prompt of the current request.
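The ranking step can be sketched with cosine similarity, a common metric for comparing embeddings. Whether Bastio uses cosine similarity, and how many memories it retrieves, is not stated here, so treat the metric, the `topK` helper, and the hand-written three-dimensional vectors as assumptions.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid division by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k stored memories most similar to the query embedding.
function topK(queryEmbedding, memories, k) {
  return memories
    .map((m) => ({ ...m, score: cosineSimilarity(queryEmbedding, m.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy embeddings, hand-written for illustration only.
const stored = [
  { text: 'My favorite color is blue.', embedding: [0.9, 0.1, 0.0] },
  { text: 'I live in Oslo.', embedding: [0.0, 0.2, 0.9] },
  { text: 'I like teal walls.', embedding: [0.7, 0.3, 0.1] },
];
const results = topK([1, 0, 0], stored, 2);
console.log(results.map((r) => r.text));
// prints [ 'My favorite color is blue.', 'I like teal walls.' ]
```

A linear scan like this is fine for a handful of memories. A vector database typically replaces it with an index so that lookups stay fast as the number of stored interactions grows.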
Privacy & Security
- Isolation: Memories are strictly isolated by proxy_id and user_id. One user's memories are never accessible to another.
- Encryption: All memory data is stored encrypted at rest.
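The isolation rule amounts to scoping every read and write by the (proxy_id, user_id) pair. The in-memory class below is a hypothetical illustration of that idea, not Bastio's storage layer. The class name, its methods, and the key encoding are assumptions.

```javascript
// Hypothetical in-memory store; a real deployment would use a vector database.
class ScopedMemoryStore {
  constructor() {
    this.buckets = new Map();
  }

  // JSON-encode the pair so that ("a:b", "c") and ("a", "b:c") cannot collide.
  static key(proxyId, userId) {
    return JSON.stringify([proxyId, userId]);
  }

  add(proxyId, userId, text) {
    const k = ScopedMemoryStore.key(proxyId, userId);
    if (!this.buckets.has(k)) this.buckets.set(k, []);
    this.buckets.get(k).push(text);
  }

  list(proxyId, userId) {
    // Return a copy so callers cannot mutate the stored bucket.
    return [...(this.buckets.get(ScopedMemoryStore.key(proxyId, userId)) ?? [])];
  }
}

const store = new ScopedMemoryStore();
store.add('proxy_a', 'user_1', 'My favorite color is blue.');
console.log(store.list('proxy_a', 'user_1').length); // 1
console.log(store.list('proxy_a', 'user_2').length); // 0
console.log(store.list('proxy_b', 'user_1').length); // 0
```

Under this scheme the same user ID on two different proxies maps to two separate buckets, so memories do not leak between proxies either.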