Documentation
Persistent memory API for any LLM. Base URL: https://becomer.net
Authentication
All memory API requests require an API key in the Authorization header.
Authorization: Bearer YOUR_API_KEY
Get your API key from the dashboard after signing up. Keys start with bcm_.
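A minimal Python sketch of building these headers. It assumes the key is kept in the BECOMER_API_KEY environment variable (the name used in the MCP configuration); auth_headers is a hypothetical helper, not part of the SDK.

```python
import os


def auth_headers(api_key=None):
    """Build request headers for the memory API (hypothetical helper)."""
    # Fall back to the environment variable when no key is passed in.
    key = api_key or os.environ.get("BECOMER_API_KEY", "")
    # Keys start with "bcm_", so anything else is most likely a paste error.
    if not key.startswith("bcm_"):
        raise ValueError("expected an API key starting with 'bcm_'")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Checking the prefix locally catches a wrong key before a request is sent; the server remains the authority on whether a key is valid.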
Memory API
POST /v1/store
Store a memory. Call this when an LLM learns something worth remembering about a user.
| Field | Type | Description |
| content | string | The memory text to store. |
| user_id | string (optional) | Sub-user identifier for multi-tenant apps. Isolates memories per end-user under one API key. |
curl -X POST https://becomer.net/v1/store \
-H "Authorization: Bearer YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"content": "User prefers dark mode and lives in Bangalore"}'
Response: {"ok": true}
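The same call can be sketched with Python's built-in urllib, for cases where the SDK is not installed. build_store_request and store are hypothetical names; the URL, headers, and fields are the ones documented above.

```python
import json
import urllib.request

BASE_URL = "https://becomer.net"


def build_store_request(api_key, content, user_id=None):
    """Prepare (but do not send) a POST /v1/store request."""
    payload = {"content": content}
    # user_id is optional; include it only for multi-tenant calls.
    if user_id is not None:
        payload["user_id"] = user_id
    return urllib.request.Request(
        f"{BASE_URL}/v1/store",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def store(api_key, content, user_id=None):
    """Send the request and return the decoded JSON body."""
    req = build_store_request(api_key, content, user_id)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Separating request construction from sending makes the payload easy to inspect or unit-test without touching the network.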
POST /v1/recall
Recall relevant memories for a query using semantic search.
| Field | Type | Description |
| query | string | What to search for in memory. |
| top_k | integer (optional) | Number of memories to return. Default: 5. |
| user_id | string (optional) | Sub-user identifier. Returns only memories stored under this user_id. |
curl -X POST https://becomer.net/v1/recall \
-H "Authorization: Bearer YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"query": "user location and preferences", "top_k": 5}'
Response: {"memories": ["User prefers dark mode...", ...]}
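A common next step is turning the recall response into text for a system prompt. The sketch below assumes only the response shape shown above; memories_to_prompt is a hypothetical helper.

```python
def memories_to_prompt(response, header="What you know about this user:"):
    """Turn a /v1/recall response body into a system-prompt string."""
    # Assumes the documented shape: {"memories": ["...", ...]}.
    memories = response.get("memories", [])
    if not memories:
        # Return an empty string so callers can skip the section entirely.
        return ""
    bullets = "\n".join(f"- {m}" for m in memories)
    return f"{header}\n{bullets}"
```

Returning an empty string when nothing is recalled avoids sending a dangling header to the model.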
POST /v1/forget
Delete all memories stored under this API key (or just one sub-user's memories if user_id is provided).
| Field | Type | Description |
| forget_all | boolean | Must be true. |
| user_id | string (optional) | If set, only deletes memories for this sub-user. Other sub-users are unaffected. |
curl -X POST https://becomer.net/v1/forget \
-H "Authorization: Bearer YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"forget_all": true}'
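Because this call is destructive, it can help to build the request body behind an explicit confirmation. The following is a sketch only; forget_payload and its confirm flag are hypothetical and not part of the API.

```python
def forget_payload(user_id=None, confirm=False):
    """Build the JSON body for POST /v1/forget (hypothetical helper)."""
    # Require an explicit opt-in so a stray call cannot wipe memories.
    if not confirm:
        raise ValueError("pass confirm=True to build a forget request")
    payload = {"forget_all": True}  # the API requires this to be true
    # With user_id set, only that sub-user's memories are deleted.
    if user_id is not None:
        payload["user_id"] = user_id
    return payload
```

Omitting user_id targets every memory under the API key, so the narrower form is usually the safer default in multi-tenant code.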
POST /v1/sync
Consolidate working memory into long-term storage. Call at end of session.
curl -X POST https://becomer.net/v1/sync \
-H "Authorization: Bearer YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{}'
Multi-tenant
Building a product where each of your users needs isolated memory? Pass user_id — one API key covers your entire user base.
from becomer import Client
# One master key for your whole app
mem = Client("bcm_your-master-key", user_id="user-alice")
mem.store("Alice prefers Python over JavaScript")
# Different user — completely isolated memory space
mem_bob = Client("bcm_your-master-key", user_id="user-bob")
mem_bob.recall("programming preferences") # → [] (cannot see Alice's memories)
# Or override per-call
mem = Client("bcm_your-master-key")
mem.store("Bob uses dark mode", user_id="user-bob")
mem.recall("preferences", user_id="user-alice") # → Alice's memories only
REST API
curl -X POST https://becomer.net/v1/store \
-H "Authorization: Bearer bcm_your-master-key" \
-H "Content-Type: application/json" \
-d '{"content": "Alice prefers dark mode", "user_id": "alice-123"}'
MCP per-user
{
  "mcpServers": {
    "becomer": {
      "command": "python",
      "args": ["-m", "becomer"],
      "env": {
        "BECOMER_API_KEY": "bcm_your-master-key",
        "BECOMER_USER_ID": "alice-123"
      }
    }
  }
}
user_id accepts alphanumeric characters plus - _ . @ +, up to 128 characters. Billing counts against the master key.
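The rule above can be checked client-side before a request is sent. This is a sketch that assumes "alphanumeric" means ASCII letters and digits; the service may accept a wider set, and is_valid_user_id is a hypothetical helper.

```python
import re

# ASCII letters and digits plus - _ . @ +, between 1 and 128 characters.
# Assumption: "alphanumeric" means ASCII; the service may accept more.
_USER_ID_RE = re.compile(r"[A-Za-z0-9._@+-]{1,128}")


def is_valid_user_id(user_id):
    """Return True if user_id matches the documented character set and length."""
    return isinstance(user_id, str) and _USER_ID_RE.fullmatch(user_id) is not None
```

For example, is_valid_user_id("alice-123") and is_valid_user_id("alice@email.com") pass, while an ID containing a space or a slash does not.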
Python SDK
Zero dependencies. Uses Python's built-in urllib.
pip install becomer
Quickstart
import becomer
mem = becomer.Client("YOUR_API_KEY")
mem.store("User's name is Priya. She's a backend engineer in Chennai.")
results = mem.recall("tell me about the user", top_k=5)
for r in results:
    print(r)
mem.sync() # consolidate at end of session
mem.forget() # delete everything
Multi-tenant
mem = becomer.Client("YOUR_API_KEY", user_id="end-user-id")
mem.store("User prefers dark mode") # stored under end-user-id
mem.recall("preferences") # only returns this user's memories
JavaScript / Node.js SDK
Zero dependencies. Node 18+ built-in fetch. TypeScript types included.
npm install @becomerpackage/sdk
Quickstart
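const { Client } = require('@becomerpackage/sdk');
// ESM: import { Client } from '@becomerpackage/sdk';
// CommonJS has no top-level await, so the calls run inside an async function.
async function main() {
  const mem = new Client('bcm_your-api-key');
  await mem.store('User prefers dark mode');
  const memories = await mem.recall('user preferences');
  console.log(memories); // → ['User prefers dark mode']
}
main().catch(console.error);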
Multi-tenant
const memAlice = new Client('bcm_your-api-key', { userId: 'alice-123' });
await memAlice.store('Alice prefers TypeScript');
// Per-call override
const mem = new Client('bcm_your-api-key');
await mem.store('Bob uses dark mode', { userId: 'bob-456' });
await mem.recall('preferences', { userId: 'alice-123', topK: 5 });
MCP — Claude Code / Claude Desktop
BECOMER ships a built-in MCP server. Install the SDK and add this to your mcp.json.
1. Install
pip install becomer
2. Add to mcp.json
{
  "mcpServers": {
    "becomer": {
      "command": "python",
      "args": ["-m", "becomer"],
      "env": {
        "BECOMER_API_KEY": "YOUR_API_KEY",
        "BECOMER_USER_ID": "optional-per-user-id"
      }
    }
  }
}
3. Available tools
store — store a memory from within Claude
recall — recall relevant memories for a query
forget — delete all memories
Once connected, Claude remembers context across sessions — no prompt engineering required.
Framework integrations
BECOMER works with any framework. Below are drop-in patterns for the most common stacks.
LangChain
pip install becomer langchain langchain-openai
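# Targets the legacy (pre-1.0) LangChain memory API.
from typing import Any
from langchain.chains import ConversationChain
from langchain_core.memory import BaseMemory
from langchain_openai import ChatOpenAI
from becomer import Client
class BecomerMemory(BaseMemory):
    # BaseMemory is a pydantic model, so the client must be a declared field.
    mem: Any = None
    def __init__(self, api_key: str):
        super().__init__(mem=Client(api_key))
    @property
    def memory_variables(self):
        return ["history"]
    def load_memory_variables(self, inputs):
        mems = self.mem.recall(inputs.get("input", ""), top_k=5)
        return {"history": "\n".join(mems)}
    def save_context(self, inputs, outputs):
        # ConversationChain returns its reply under "response"; fall back to "output".
        reply = outputs.get("response", outputs.get("output", ""))
        self.mem.store(f"User: {inputs['input']} | AI: {reply}")
    def clear(self):
        self.mem.forget()
chain = ConversationChain(llm=ChatOpenAI(model="gpt-4o"), memory=BecomerMemory("YOUR_API_KEY"))
chain.predict(input="What did we discuss last time?")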
LlamaIndex
from becomer import Client
mem = Client("YOUR_API_KEY")
# engine is any LlamaIndex query engine, e.g. index.as_query_engine()
def query_with_memory(engine, user_input: str):
    context = mem.recall(user_input, top_k=5)
    # Prepend recalled context to the query text so it reaches the LLM.
    prompt = user_input
    if context:
        prompt = "User context:\n" + "\n".join(context) + "\n\nQuestion: " + user_input
    response = engine.query(prompt)
    mem.store(f"User asked: {user_input}. Key points: {str(response)[:300]}")
    return response
LangGraph
from becomer import Client
mem = Client("YOUR_API_KEY")
def recall_node(state):
    state["context"] = mem.recall(state["message"], top_k=5)
    return state
def store_node(state):
    mem.store(f"User: {state['message']} | AI: {state['response'][:200]}")
    return state
CrewAI
from crewai import Agent
from becomer import Client
mem = Client("YOUR_API_KEY")
def make_agent_with_memory(task: str) -> Agent:
    context = "\n".join(mem.recall(task, top_k=5))
    return Agent(
        role="Researcher",
        goal="Answer the user's question accurately",
        backstory=f"User context:\n{context}" if context else "No prior context.",
    )
AutoGen
import autogen
from becomer import Client
mem = Client("YOUR_API_KEY")
def chat(message: str) -> str:
    context = "\n".join(mem.recall(message, top_k=5))
    system = "You are a helpful assistant."
    if context:
        system += f"\n\nUser context:\n{context}"
    assistant = autogen.AssistantAgent("assistant",
        llm_config={"config_list": [{"model": "gpt-4o", "api_key": "sk-..."}]},
        system_message=system)
    user = autogen.UserProxyAgent("user", human_input_mode="NEVER",
        max_consecutive_auto_reply=1, code_execution_config=False)
    user.initiate_chat(assistant, message=message)
    reply = user.last_message()["content"]
    mem.store(f"{message} → {reply[:200]}")
    return reply
OpenAI / Anthropic
from openai import OpenAI
from becomer import Client
mem = Client("YOUR_API_KEY")
openai = OpenAI()
def chat(message: str) -> str:
    context = "\n".join(mem.recall(message, top_k=5))
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"What you know about this user:\n{context}"},
            {"role": "user", "content": message}
        ]
    )
    answer = response.choices[0].message.content
    mem.store(f"{message} → {answer}")
    return answer
Use cases
BECOMER is a memory primitive. Below are the three patterns that unlock the most value — especially for agent-based and autonomous systems.
Multi-agent shared memory
Multiple agents working on the same task share a single memory namespace via user_id. This needs no message passing and no coordination code, and recall itself spends no LLM tokens. Each agent, regardless of which LLM powers it, reads and writes to the same store.
from becomer import Client
# Research agent (GPT-4o) — discovers facts
researcher = Client("YOUR_KEY", user_id="task-abc-123")
researcher.store("API endpoint: POST /v2/payments, requires OAuth2 bearer token")
researcher.store("Rate limit: 100 req/min — returns 429 on breach")
researcher.store("Webhook events: payment.success, payment.failed, payment.refunded")
# Executor agent (Claude) — different process, different LLM, same namespace
executor = Client("YOUR_KEY", user_id="task-abc-123")
context = executor.recall("payment API details")
# → researcher's findings, zero message passing needed
# Reviewer agent (Gemini) — audits what was done
reviewer = Client("YOUR_KEY", user_id="task-abc-123")
audit = reviewer.recall("what was implemented and what are the limits?")
Clean up the task namespace when done:
executor.forget() # wipes task-abc-123 namespace only — master key unaffected
Self-improving AI systems
Store every attempt with its outcome. Recall what worked before the next run. Semantic retrieval surfaces similar past attempts even with different phrasing — no structured query language, no schema design.
from becomer import Client
mem = Client("YOUR_KEY", user_id="system-v1")
def run_iteration(approach: str, task: str) -> float:
    # Recall best past approaches before running
    history = mem.recall(f"what approaches worked best for {task}?", top_k=5)
    system_prompt = ("Previous learnings:\n" + "\n".join(history)) if history else ""
    # Run your agent/pipeline here (run_agent is your own function; it returns a score in [0, 1])
    score = run_agent(task, approach, system_prompt=system_prompt)
    # Store the outcome
    mem.store(f"Task: {task}. Approach: {approach}. Score: {score:.0%}. "
              f"{'Worked well' if score > 0.85 else 'Underperformed'}.")
    return score
# System gets smarter every iteration
run_iteration("zero-shot", "code review") # stores 71%
run_iteration("chain-of-thought", "code review") # stores 84%
run_iteration("few-shot + CoT", "code review") # recalls 84% → gets to 91%
Pair with a structured DB for metrics and aggregations — BECOMER handles natural language recall ("what worked?"), your DB handles "show all attempts where score > 85%".
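One way to sketch the structured side of that pairing is with Python's built-in sqlite3. The table name, columns, and helper functions below are illustrative assumptions, not part of BECOMER.

```python
import sqlite3


def open_metrics_db(path=":memory:"):
    """Open (or create) a small attempts table for exact-match queries."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS attempts ("
        "task TEXT NOT NULL, approach TEXT NOT NULL, score REAL NOT NULL)"
    )
    return conn


def record_attempt(conn, task, approach, score):
    """Insert one attempt; call this next to mem.store() in run_iteration."""
    conn.execute(
        "INSERT INTO attempts (task, approach, score) VALUES (?, ?, ?)",
        (task, approach, score),
    )
    conn.commit()


def attempts_above(conn, threshold):
    """Return (task, approach, score) rows with score above threshold, best first."""
    rows = conn.execute(
        "SELECT task, approach, score FROM attempts "
        "WHERE score > ? ORDER BY score DESC",
        (threshold,),
    )
    return rows.fetchall()
```

With the three example runs recorded, attempts_above(conn, 0.85) returns only the "few-shot + CoT" row, which is the kind of exact filter that semantic recall is not designed for.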
Long-running tasks across sessions
Agents that run for hours, get interrupted, restart days later — and continue exactly where they stopped. No checkpoint files, no manual state serialisation.
from becomer import Client
mem = Client("YOUR_KEY", user_id="project-payments-v2")
# Day 1 — agent works for 2 hours, session ends
def on_interrupt():
    mem.store("Completed: steps 1–6, schema designed and reviewed")
    mem.store("In progress: step 7 — auth middleware implementation")
    mem.store("Blocked: evaluating authlib vs python-jose, need to benchmark both")
    mem.sync() # consolidate to long-term storage
# Day 3 — fresh session, full context restored in seconds
status = mem.recall("what step am I on?", top_k=1)
blockers = mem.recall("what am I blocked on?", top_k=1)
done = mem.recall("what has been completed?", top_k=1)
# Expected top hits (ranking is semantic, so treat these as likely rather than guaranteed):
# status   → the "In progress: step 7" memory
# blockers → the "Blocked: evaluating authlib vs python-jose" memory
# done     → the "Completed: steps 1–6" memory
# Agent continues from exactly step 7 — no setup, no files
Best practices
How you store and query memories directly affects recall quality. These patterns consistently produce the best results.
1. Store atomic facts, not paragraphs
One fact per store call — each memory surfaces independently and precisely.
# Bad — one blob, harder to surface individual facts
mem.store("User is Sarah, works at Stripe, likes TypeScript, prefers dark mode, based in NYC")
# Good — one fact per store
mem.store("User's name is Sarah")
mem.store("Sarah works at Stripe as a senior engineer")
mem.store("Sarah prefers TypeScript over JavaScript")
mem.store("Sarah prefers dark mode")
mem.store("Sarah is based in New York")
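When facts arrive as a batch, a small wrapper can keep the one-fact-per-call habit. The sketch below relies only on the store() method shown above; store_facts is a hypothetical helper, and mem is any object with that method.

```python
def store_facts(mem, facts):
    """Store each fact with its own store() call; return how many were sent."""
    seen = set()
    count = 0
    for fact in facts:
        text = fact.strip()
        # Skip blanks and exact repeats within this batch.
        if not text or text in seen:
            continue
        seen.add(text)
        mem.store(text)
        count += 1
    return count
```

The de-duplication here is local to one batch; it does not check what is already stored on the server.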
2. Be specific when querying
The retrieval engine returns the closest semantic match to your query. Vague queries return vague results.
# Too vague
mem.recall("what do they do?")
# Specific — surfaces what you actually need
mem.recall("what is the user's job title and company?")
mem.recall("what programming languages does the user prefer?")
3. Anchor entity names in both store and recall
# Bad — pronoun drift
mem.store("She prefers Python")
# Good — entity anchored
mem.store("Sarah prefers Python")
mem.recall("what does Sarah prefer?")
4. Store signal, not noise
Ask: would this be useful to know in 3 months? If yes, store it. If no, skip it.
# Noise — don't store this
mem.store("User said hello and asked how I'm doing")
# Signal — store this
mem.store("User is building a payments app in React with Stripe integration")
5. Tune top_k to your use case
| Use case | top_k |
| Focused factual question | 3 |
| General context before LLM call | 5 (default) |
| Full context dump | 10–15 |
| Open-ended exploration | 20+ |
Too many memories pollute your LLM prompt; too few miss context. 5 is a sensible default for most conversational apps.
6. Call sync() at session end
# At the end of every conversation session
mem.sync() # consolidates working memory into long-term storage
Without this, recent memories may not surface as reliably in future sessions.
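To make sure sync() still runs when a session ends with an exception, the call can be wrapped in a context manager. This is a sketch; memory_session is a hypothetical helper built on the SDK's sync() method.

```python
from contextlib import contextmanager


@contextmanager
def memory_session(mem):
    """Yield the client and call sync() when the block exits."""
    try:
        yield mem
    finally:
        # Runs on normal exit and on exceptions alike.
        mem.sync()
```

Usage is `with memory_session(mem) as m: ...`, with all store and recall calls for the conversation placed inside the block.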
7. Use user_id consistently in multi-tenant apps
# Always use the same identifier for the same person
mem = Client(key, user_id="user-uuid-from-your-db")
# Never mix identifiers for the same user
# Wrong: "alice@email.com" one session, "alice-123" the next
# Memories are namespaced by user_id exactly — different string = different namespace
Rate limits & plans
| Plan | API calls / month | Rate limit | Price |
| Free | 1,000 | 20 req / min | $0 |
| Pro | 50,000 | 120 req / min | $12 / month |
Rate limit errors return HTTP 429. Monthly quota errors return HTTP 402.
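A client-side throttle can keep a busy process under the per-minute limit instead of waiting for a 429. The sketch below is a simple sliding-window limiter; the Throttle class is hypothetical, and its defaults mirror the Free plan's 20 requests per minute.

```python
import time
from collections import deque


class Throttle:
    """Allow at most max_calls calls per period seconds (sliding window)."""

    def __init__(self, max_calls=20, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()

    def wait(self):
        """Block until another call fits inside the window, then record it."""
        now = self._clock()
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._stamps[0]))
            now = self._clock()
            self._stamps.popleft()
        self._stamps.append(now)
```

Call wait() immediately before each API request. The limiter only sees calls from its own process, so several workers sharing one key can still exceed the limit together.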
Error codes
| HTTP | Error | Meaning |
| 400 | bad_request | Missing or invalid fields. |
| 401 | unauthorized | Invalid or missing API key. |
| 402 | quota_exceeded | Monthly limit reached. Upgrade to Pro. |
| 429 | rate_limited | Too many requests. Slow down. |
| 500 | internal_error | Server error. Try again. |
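The table suggests a simple policy: retry 429 and 500, and surface 400, 401, and 402 immediately because repeating the request will not help. The sketch below assumes an exponential backoff schedule, which the API does not prescribe; call_with_retry and ApiError are hypothetical names.

```python
import time

RETRYABLE = {429, 500}  # rate_limited, internal_error


class ApiError(Exception):
    """Raised for non-retryable statuses or when retries run out."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds, retrying only on 429 and 500.

    send is any zero-argument callable returning (status, body).
    """
    for attempt in range(max_attempts):
        status, body = send()
        if 200 <= status < 300:
            return body
        # 400, 401 and 402 will not succeed on retry, so fail fast.
        if status not in RETRYABLE or attempt == max_attempts - 1:
            raise ApiError(status)
        # Assumed schedule: exponential backoff of 1s, 2s, 4s, ...
        sleep(base_delay * 2 ** attempt)
```

Passing sleep as a parameter keeps the function testable without real delays.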