Memory infrastructure for AI developers

Persistent memory for any LLM.
Zero tokens.

Drop memory into your existing workflow. Works with GPT, Claude, Gemini, Llama, or any model — and they all share the same memory store. Your users' context survives every session, across every LLM. Every run teaches the next — your pipeline compounds in intelligence with zero extra code.

Start free → See how it works

LIVE api.becomer.net

JSON · stateless · per-user ● 200 OK · 163ms

One API key. Every LLM. One shared memory.

Your GPT app, Claude app, and LangChain agent
all remember the same user.

[Diagram: GPT, Claude, Gemini, and LangChain write to the BECOMER shared memory store; GPT, Claude, Gemini, and LangChain read from it.]

from becomer import Client

# GPT stores it
Client("bk-your-key").store("Sarah prefers TypeScript, works at Stripe")

# Claude recalls it — same key, same store
Client("bk-your-key").recall("who is the user?")
# → ["Sarah prefers TypeScript, works at Stripe"]

01 — Integration

Two lines before. Two lines after.

That's the entire integration. Connect via REST API or MCP — works in any stack.

# Store memory
POST https://api.becomer.net/v1/store
Authorization: Bearer YOUR_API_KEY
{ "content": "User prefers dark mode and concise answers" }

# Recall before your LLM call
POST https://api.becomer.net/v1/recall
Authorization: Bearer YOUR_API_KEY
{ "query": "What does this user prefer?", "top_k": 5 }

# Returns: { "memories": ["User prefers dark mode and concise answers"] }
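
For stacks without an SDK, the two requests above can be issued from plain Python using only the standard library. This is a minimal sketch, not official client code: the `Content-Type: application/json` header is an assumption (the requests above do not state it), and `build_request` and `call` are hypothetical helper names.

```python
import json
import urllib.request

API_BASE = "https://api.becomer.net/v1"  # base URL taken from the requests above


def build_request(endpoint: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the store or recall endpoint."""
    return urllib.request.Request(
        url=f"{API_BASE}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumption: JSON request bodies
        },
        method="POST",
    )


def call(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


store_req = build_request("store", "YOUR_API_KEY", {"content": "User prefers dark mode"})
recall_req = build_request(
    "recall", "YOUR_API_KEY", {"query": "What does this user prefer?", "top_k": 5}
)
# call(store_req) and call(recall_req) perform the actual network round trips.
```

Building the request separately from sending it keeps the payload easy to inspect or log before any network traffic happens.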

# Add to your mcp.json
{
  "mcpServers": {
    "becomer": {
      "command": "python",
      "args": ["-m", "becomer"],
      "env": { "BECOMER_API_KEY": "YOUR_API_KEY" }
    }
  }
}

# pip install becomer
import becomer

mem = becomer.Client("YOUR_API_KEY")

# Before your LLM call
context = mem.recall("What does this user like?")

# After your LLM call
mem.store("User asked about Python decorators")

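# pip install becomer langchain langchain-openai
from typing import Any

from langchain_core.memory import BaseMemory
from becomer import Client

class BecomerMemory(BaseMemory):
    # BaseMemory is a pydantic model, so the client must be a declared field
    mem: Any = None

    def __init__(self, api_key: str):
        super().__init__(mem=Client(api_key))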

    @property
    def memory_variables(self): return ["history"]

    def load_memory_variables(self, inputs):
        mems = self.mem.recall(inputs.get("input", ""), top_k=5)
        return {"history": "\n".join(mems)}

    def save_context(self, inputs, outputs):
        self.mem.store(f"User: {inputs['input']} | AI: {outputs['output']}")

    def clear(self): self.mem.forget()

# Drop into any chain — persistent across sessions
from langchain.chains import ConversationChain
from langchain_openai import ChatOpenAI

chain = ConversationChain(
    llm=ChatOpenAI(model="gpt-4o"),
    memory=BecomerMemory("YOUR_API_KEY")
)

# pip install becomer llama-index
from llama_index.core import VectorStoreIndex, Settings
from llama_index.core.memory import ChatMemoryBuffer
from becomer import Client

mem = Client("YOUR_API_KEY")

# Seed context before querying
def query_with_memory(engine, user_input):
    context = mem.recall(user_input, top_k=5)
    memory_block = "What you know about this user:\n" + "\n".join(context)

    # Prepend the recalled context to the query text so the engine's LLM sees it
    response = engine.query(f"{memory_block}\n\nQuestion: {user_input}")

    # Persist new context
    mem.store(f"User asked: {user_input}. Key points: {str(response)[:300]}")
    return response

# Works with CrewAI, AutoGen, LangGraph, or any agent framework
# Pattern: recall before → run agent → store after
from becomer import Client

mem = Client("YOUR_API_KEY")

def run_agent(user_id: str, task: str):
    # 1. Load relevant context for this user before the agent runs
    context = mem.recall(task, top_k=5, user_id=user_id)

    # 2. Inject into your agent / system prompt
    #    (your_agent is a placeholder for your framework's agent object)
    agent_result = your_agent.run(
        task=task,
        system_context="\n".join(context)
    )

    # 3. Store what was learned for next time, scoped to the same user
    mem.store(f"Task: {task} | Result: {str(agent_result)[:200]}", user_id=user_id)
    return agent_result

# OpenAI / Anthropic direct: same pattern
from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment

def chat(message: str) -> str:
    context = "\n".join(mem.recall(message, top_k=5))
    response = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"User context:\n{context}"},
            {"role": "user", "content": message}
        ]
    )
    answer = response.choices[0].message.content
    mem.store(f"{message} → {answer}")
    return answer
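
The recall-before, store-after pattern repeated in the samples above can be factored into one reusable wrapper. The sketch below is illustrative only: `MemoryClient`, `with_memory`, `FakeMemory`, and `echo_llm` are hypothetical names, and `FakeMemory` returns the most recent entries rather than performing semantic retrieval, so it is useful for local tests but not as a model of the real service.

```python
from typing import Callable, List, Protocol


class MemoryClient(Protocol):
    """Structural type covering the two calls used in the samples above."""

    def recall(self, query: str, top_k: int = 5) -> List[str]: ...

    def store(self, content: str) -> None: ...


def with_memory(
    mem: MemoryClient, llm: Callable[[str, str], str], top_k: int = 5
) -> Callable[[str], str]:
    """Wrap llm(context, message) so it recalls before and stores after each call."""

    def chat(message: str) -> str:
        context = "\n".join(mem.recall(message, top_k=top_k))
        answer = llm(context, message)
        mem.store(f"{message} -> {answer}")
        return answer

    return chat


class FakeMemory:
    """In-memory stand-in for tests: returns the latest entries, not a semantic match."""

    def __init__(self) -> None:
        self.items: List[str] = []

    def recall(self, query: str, top_k: int = 5) -> List[str]:
        return self.items[-top_k:]

    def store(self, content: str) -> None:
        self.items.append(content)


def echo_llm(context: str, message: str) -> str:
    """Placeholder model that reports how many context lines it received."""
    lines = [line for line in context.split("\n") if line]
    return f"seen {len(lines)} memories"


chat = with_memory(FakeMemory(), echo_llm)
first = chat("hello")         # no memories yet
second = chat("hello again")  # one memory stored by the first call
```

Because the wrapper depends only on the `recall` and `store` methods, swapping `FakeMemory` for a real client should require no other changes.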

02 — Benchmarks

Benchmark results

Tested against LongMemEval (n=500) and LOCOMO (n=1,978), two widely used long-term memory benchmarks. BECOMER matches Mem0's overall LongMemEval score with zero LLM tokens: same accuracy, roughly 6,787 fewer tokens per query.

Benchmark	BECOMER	Mem0 (June 2026)	Mem0 tokens/query
LongMemEval (overall)	94.4% (0 tokens)	94.4%	~6,787
LME: Temporal reasoning	~93%	~93%	~6,787
LME: Knowledge update	~95%	96.2%	~6,787
LME: Multi-session	~87%	86.5%	~6,787
LOCOMO (overall)	69.5% (retrieval only)	91.6%	~6,956
LOCOMO: Temporal	76.9%	92.8%	~6,956
LOCOMO: Multi-hop	59.6%	93.3%	~6,956

How to read these numbers: on LongMemEval, BECOMER matches Mem0's overall score (94.4%) while spending zero LLM tokens per query, against roughly 6,787 tokens per recall for Mem0. On LOCOMO, systems that add an LLM reasoning pass score higher. BECOMER's LOCOMO figures are retrieval only: it returns the relevant context and your own LLM reasons on top. We retrieve, you reason.
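
To put the per-query figure in context, the arithmetic can be worked through for a given query volume. The sketch below uses the ~6,787 tokens-per-query number from the table; the query volume and the price of $1.00 per million tokens are hypothetical inputs chosen only to make the calculation concrete.

```python
MEM0_TOKENS_PER_QUERY = 6_787  # LongMemEval figure quoted in the table above
BECOMER_TOKENS_PER_QUERY = 0   # retrieval without LLM calls


def tokens_saved(queries: int) -> int:
    """Total LLM tokens avoided over a number of recall queries."""
    return queries * (MEM0_TOKENS_PER_QUERY - BECOMER_TOKENS_PER_QUERY)


def cost_saved_usd(queries: int, usd_per_million_tokens: float) -> float:
    """Token savings converted to dollars at a caller-supplied price."""
    return tokens_saved(queries) / 1_000_000 * usd_per_million_tokens


monthly_queries = 50_000  # illustrative volume, not a measured workload
print(tokens_saved(monthly_queries))                    # 339350000
print(round(cost_saved_usd(monthly_queries, 1.00), 2))  # 339.35 at a hypothetical $1.00 per million tokens
```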

Full methodology + 30 iterations + competitor comparison →

Recall P50: ~150ms
Store P50: ~100ms
Recall P99: <500ms
LLM wait time: 0ms

Measured on CPU-only server. No GPU. No LLM calls. Pure retrieval engine.
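
Percentile figures like these can be checked against your own workload. The following is a rough sketch of one way to do so, using the nearest-rank percentile definition; `percentile` and `time_calls` are hypothetical helpers, and the sample latencies are made-up numbers chosen so the result can be verified by hand.

```python
import math
import time
from typing import Callable, List


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def time_calls(fn: Callable[[], object], runs: int) -> List[float]:
    """Return the wall-clock latency in milliseconds of each of `runs` calls."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies


# Made-up samples (milliseconds) so the result is easy to check by hand.
samples_ms = [120, 140, 150, 155, 160, 170, 180, 210, 300, 480]
p50 = percentile(samples_ms, 50)  # 160
p99 = percentile(samples_ms, 99)  # 480
```

In practice, `time_calls(lambda: mem.recall("some query"), runs=200)` would produce the samples, and the measured numbers will include your own network latency to the API.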

03 — Architecture

Built differently

Most memory systems run an LLM under the hood — every recall spends tokens. BECOMER doesn't, so there's nothing to pay per query.

🔗

One key. Every LLM. Shared memory.

Your GPT-4 app, Claude app, and LangChain agent all read and write to the same memory store — no syncing, no duplication.

from becomer import Client

# App 1: GPT stores a memory
mem = Client("bk-same-key")
mem.store("User is Sarah, prefers TypeScript")

# App 2 — Claude recalls it instantly
mem = Client("bk-same-key")
mem.recall("who is the user?")
# → ["User is Sarah, prefers TypeScript"]

⚡

Zero tokens consumed

Storage and retrieval happen in our engine. No LLM calls. No per-query token cost.

🔌

Any LLM, any stack

Works with GPT, Claude, Gemini, Llama, Mistral, or your own fine-tuned model.

🔒

Memory stays in our engine

Your users' memories are never sent to a third-party LLM. Pure retrieval, no data leakage.

📊

30+ benchmark iterations

Zero regressions across all phases. 49/49 tests pass. Built to production quality.

📈

Every run teaches the next one

Every store() compounds. An agent that ran 100 tasks has 100 learnings to draw on for task 101 — at zero extra cost. Your pipeline gets smarter the more it runs.

04 — Use Cases

Built for what AI is becoming.

Single-turn chatbots are yesterday. Agents, pipelines, and self-improving systems need persistent memory that works across sessions, LLMs, and processes. The longer they run, the smarter they get.

Shared memory across agents

Multiple agents working on the same task — researcher, executor, reviewer — share a single memory namespace. No message passing. No coordination code. Zero tokens on every recall.

Research agent stores findings (GPT, Claude, or any LLM) → Executor recalls with zero tokens (different process, same namespace) → Reviewer sees full context (no message passing needed)

from becomer import Client

# Research agent (GPT-4o)
mem = Client("key", user_id="task-abc")
mem.store("API endpoint: POST /v2/payments, OAuth2")
mem.store("Rate limit: 100 req/min, 429 on breach")
mem.store("Auth token expires in 3600s")

# Executor agent (Claude) — different process
# No message passing needed
mem = Client("key", user_id="task-abc")
ctx = mem.recall("payment API details")
# → ["API endpoint: POST /v2/payments, OAuth2",
#    "Rate limit: 100 req/min...",
#    "Auth token expires in 3600s"]

# Reviewer agent (Gemini) — same namespace
audit = mem.recall("what was implemented?")

Systems that learn from themselves

Store every attempt with its outcome. Recall what worked before the next run. Semantic retrieval surfaces similar past attempts even with different phrasing — no structured query language needed.

Iteration 1: zero-shot prompt, 71%
Iteration 2: chain-of-thought, 84%
Iteration 3 (recalled best approach): few-shot + CoT, 91%

# Store every attempt with outcome
mem.store("Approach: zero-shot. Score: 71%. "
          "Weakness: misses edge cases")
mem.store("Approach: chain-of-thought. Score: 84%. "
          "Strength: handles multi-step reasoning")

# Before next run — recall what worked
history = mem.recall(
    "what approaches scored highest?",
    top_k=10
)

# Inject learnings into system prompt
system = "Previous learnings:\n" + \
         "\n".join(history)

# System now builds on its own history
# Gets smarter every iteration
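
If attempts are stored in a consistent text format, the recalled strings can also be ranked in code before they go into the prompt. The sketch below assumes the "Approach: ... Score: NN%" wording used in the sample above; `parse_attempt` and `best_attempt` are hypothetical helpers, and the list stands in for what a recall call might return.

```python
import re
from typing import List, Optional, Tuple

# Matches the "Approach: <name>. Score: <number>%" convention from the sample above.
SCORE_PATTERN = re.compile(
    r"Approach:\s*(?P<approach>.+?)\.\s*Score:\s*(?P<score>\d+(?:\.\d+)?)%"
)


def parse_attempt(memory: str) -> Optional[Tuple[str, float]]:
    """Extract (approach, score) from one stored memory, or None if it does not match."""
    match = SCORE_PATTERN.search(memory)
    if match is None:
        return None
    return match.group("approach"), float(match.group("score"))


def best_attempt(memories: List[str]) -> Optional[Tuple[str, float]]:
    """Return the highest-scoring parsed attempt, ignoring unparseable memories."""
    parsed = [p for p in map(parse_attempt, memories) if p is not None]
    return max(parsed, key=lambda p: p[1]) if parsed else None


history = [
    "Approach: zero-shot. Score: 71%. Weakness: misses edge cases",
    "Approach: chain-of-thought. Score: 84%. Strength: handles multi-step reasoning",
]
print(best_attempt(history))  # ('chain-of-thought', 84.0)
```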

Pick up where you left off

Agents that run for hours, get interrupted, restart days later — and continue exactly where they stopped. No manual state management. No checkpoint files.

Day 1, session ends: progress stored in one call → Day 3, new session: full context recalled instantly → Continues from step 7: no files, no checkpoints

# Day 1 — agent works, then gets interrupted
mem.store("Completed steps 1-6: schema designed")
mem.store("Step 7 in progress: auth middleware")
mem.store("Blocked: choosing between authlib "
          "vs python-jose — evaluate tomorrow")

# Day 3 — fresh session, full context restored
status = mem.recall("what was I working on?")
blockers = mem.recall("what am I blocked on?")

# → ["Step 7 in progress: auth middleware"]
# → ["Blocked: choosing between authlib..."]

# Agent continues from exactly step 7
# No files, no checkpoints, no setup

05 — Pricing

Simple pricing

Start free. Scale when you're ready.

Free

/ month

1,000 API calls / month
Unlimited memories stored
REST API + MCP
Dashboard access

Get started free

Pro

50,000 API calls / month
Unlimited memories stored
REST API + MCP
Priority support
Usage analytics

Start Pro

Team

Coming soon

500,000 API calls / month
Unlimited memories
Team API keys
SLA + dedicated support

06 — Security

Your data, sealed shut.

We are infrastructure, not an audience. Your stored content is encrypted, isolated per account, and never sent to any third party for processing.

🔒

Encrypted at rest

Your memories sit in an encrypted database. Even direct disk access reveals nothing readable.

🚫

No third-party processing

Your data is never sent to any external LLM, AI service, or analytics provider — and never sold or shared. Stored on dedicated cloud infrastructure, isolated per account.

🌐

Compliance built in

DPDP Act 2023 compliant. CCPA-aligned for US users. Your data stays on our infrastructure and never crosses into a third-party pipeline.

🧱

Isolated per account

Database-level isolation means one customer can never read another's memories. Enforced by the engine, not just code.

07 — FAQ

Common questions

How does BECOMER use zero tokens?

BECOMER's retrieval engine handles storage and recall without calling any language model. Your LLM call happens outside BECOMER: we return the relevant context for you to inject into your prompt. No GPT, Claude, or Gemini is invoked inside the memory layer.

How is my data protected?

Memories are encrypted at rest, isolated per account at the database level, and never sent to any external AI service or analytics provider. One account can never read another's data. This is enforced by the database engine, not just application code.

Can I use one API key for multiple users in my app?

Yes. Pass a user_id in any API call and BECOMER isolates that user's memories under your master key. One key covers your entire user base.

mem.store("Alice loves TypeScript", user_id="alice-123")
mem.recall("preferences", user_id="alice-123")  # → Alice's only

Each user_id is fully isolated — one user can never read another's memories. Billing counts against the master key. Browse and delete per-user memories from the Users tab in your dashboard.
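
The isolation described here can be pictured as one memory list per (master key, user_id) pair, with usage tallied per master key. The toy model below illustrates that idea only; it is not BECOMER's implementation, `NamespacedStore` is a hypothetical class, and its `recall` returns everything in the namespace instead of ranking by relevance.

```python
from collections import defaultdict
from typing import DefaultDict, List, Optional, Tuple


class NamespacedStore:
    """Toy model: one list of memories per (api_key, user_id) pair."""

    def __init__(self) -> None:
        self._data: DefaultDict[Tuple[str, Optional[str]], List[str]] = defaultdict(list)
        self.calls_per_key: DefaultDict[str, int] = defaultdict(int)

    def store(self, api_key: str, content: str, user_id: Optional[str] = None) -> None:
        self.calls_per_key[api_key] += 1  # usage counts against the master key
        self._data[(api_key, user_id)].append(content)

    def recall(self, api_key: str, user_id: Optional[str] = None) -> List[str]:
        self.calls_per_key[api_key] += 1
        return list(self._data[(api_key, user_id)])  # copy, so callers cannot mutate the store


store = NamespacedStore()
store.store("bk-master", "Alice prefers dark mode", user_id="alice-123")
store.store("bk-master", "Bob prefers light mode", user_id="bob-456")

alice = store.recall("bk-master", user_id="alice-123")
bob = store.recall("bk-master", user_id="bob-456")
```

Each user sees only their own entries, while all four calls are counted against the single master key.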

What happens when I hit the free tier limit?

You receive an HTTP 402 response with a quota_exceeded error. Upgrade to Pro from your dashboard to continue. Quotas reset on the 1st of each month.
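
Client code can treat the 402 response as a distinct, recoverable condition and handle it separately from other failures. A minimal sketch follows; the exact JSON shape of the error body is not documented here, so reading the code from an "error" key is an assumption, and `QuotaExceededError` and `check_response` are hypothetical names.

```python
import json


class QuotaExceededError(Exception):
    """Raised when the API reports that the plan's monthly quota is used up."""


def check_response(status: int, body: str) -> dict:
    """Return the decoded JSON body, or raise if the response signals a failure."""
    try:
        payload = json.loads(body) if body else {}
    except json.JSONDecodeError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}

    if status == 402:
        # Assumption: the error code is carried under an "error" key.
        code = payload.get("error", "quota_exceeded")
        raise QuotaExceededError(f"{code}: upgrade the plan or wait for the monthly reset")
    if status >= 400:
        raise RuntimeError(f"request failed with HTTP {status}")
    return payload


try:
    check_response(402, '{"error": "quota_exceeded"}')
except QuotaExceededError as exc:
    message = str(exc)  # surface this to an operator or trigger a fallback path
```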

Is there a JavaScript or Node.js SDK?

Yes. Install with npm install @becomerpackage/sdk (zero dependencies, Node 18+, TypeScript types included). Python SDK: pip install becomer.

How does the MCP integration work?

Run python -m becomer with your BECOMER_API_KEY set and add it to your mcp.json. Claude Desktop and Cursor then call store and recall as MCP tools, with no code changes needed. For multi-tenant apps, also set BECOMER_USER_ID to scope memory to a specific end-user. Full config in docs.
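
For multi-tenant setups, the mcp.json entry can be generated per end-user instead of edited by hand. The sketch below builds the same structure as the mcp.json sample shown in the Integration section and adds BECOMER_USER_ID when a user is given; `mcp_server_entry` is a hypothetical helper, and where the resulting file should live depends on the MCP client.

```python
import json
from typing import Optional


def mcp_server_entry(api_key: str, user_id: Optional[str] = None) -> dict:
    """Build the mcp.json content for the BECOMER server, optionally scoped to one user."""
    env = {"BECOMER_API_KEY": api_key}
    if user_id is not None:
        env["BECOMER_USER_ID"] = user_id  # scopes memory to a single end-user
    return {
        "mcpServers": {
            "becomer": {
                "command": "python",
                "args": ["-m", "becomer"],
                "env": env,
            }
        }
    }


config = mcp_server_entry("YOUR_API_KEY", user_id="alice-123")
print(json.dumps(config, indent=2))
```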

Can I self-host BECOMER?

No. BECOMER is a managed cloud service. There is no self-hosted option at this time.

What happens to my data if I cancel?

Your memories are deleted on cancellation. Export everything from the dashboard first using the export button, which downloads all your stored memories as a text file.

How does BECOMER compare to mem0, Zep, and Hindsight?

Side-by-side benchmark scores, token cost, pricing, and feature comparison.

See full comparison →
