Quickstart
The installer puts memspine-server and memspine-admin in ~/.local/bin — prebuilt for Linux x86_64 and macOS arm64, checksums verified, no root, no dependencies. Start the server and talk to it with curl; in v0 the Bearer token is the tenant id, so there is no control-plane setup:
# embeddings default to 384 dims; pin to 4 for this toy walkthrough
MEMSPINE_BIND=0.0.0.0:7777 MEMSPINE_EMBEDDING_DIM=4 memspine-server
# append, then traverse provenance
curl -X POST localhost:7777/v1/memory/event \
-H "Authorization: Bearer 00000000-0000-0000-0000-000000000042" \
-H "Content-Type: application/json" \
-d '{"kind":"tool_call","embedding":[0.1,0.2,0.3,0.4],
"properties":{"name":"web.fetch","url":"https://example.com"}}'
curl -X POST localhost:7777/v1/memory/cypher \
-H "Authorization: Bearer 00000000-0000-0000-0000-000000000042" \
-H "Content-Type: application/json" \
-d '{"query":"MATCH (n:Event)-[:SOURCE_FROM]->(m:Event) RETURN count(n)"}'
Every event lands in the tenant's own FFS database file — one file per tenant, opened on first request and kept warm in a pool. There is no shared table for a noisy neighbour to sit in.
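The same two calls can be made from any HTTP client. The following is a minimal Python sketch using only the standard library, not the shipped Python SDK. It assumes the server is running locally as in the walkthrough and that responses are JSON; `build_request` and `send` are illustrative helper names, not part of memspine.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7777"  # assumption: server started as in the quickstart
TENANT = "00000000-0000-0000-0000-000000000042"  # in v0 the Bearer token is the tenant id


def build_request(path: str, payload: dict, token: str = TENANT) -> urllib.request.Request:
    """Build a JSON POST request for a memspine route without sending it."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request):
    """Send the request and decode the body, assuming the server answers with JSON."""
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same payloads as the curl walkthrough: a 4-dim embedding to match MEMSPINE_EMBEDDING_DIM=4.
append_event = build_request("/v1/memory/event", {
    "kind": "tool_call",
    "embedding": [0.1, 0.2, 0.3, 0.4],
    "properties": {"name": "web.fetch", "url": "https://example.com"},
})
count_edges = build_request("/v1/memory/cypher", {
    "query": "MATCH (n:Event)-[:SOURCE_FROM]->(m:Event) RETURN count(n)",
})
```

With the server up, `send(append_event)` followed by `send(count_edges)` mirrors the curl sequence; nothing is sent until `send` is called.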
API
Authentication uses an Authorization: Bearer <key> header, and per-key scopes are enforced on every handler.
| Route | Scope | Description |
|---|---|---|
| GET /health | — | liveness probe |
| GET /metrics | — | Prometheus exposition, per-tenant series |
| GET /v1/whoami | read | echo the resolved tenant |
| POST /v1/memory/event | write | append a typed event, with optional embedding |
| POST /v1/memory/search | read | vector search, optional Cypher predicate |
| POST /v1/memory/cypher | read | Cypher over events and provenance edges |
| POST /v1/admin/tenants | admin | create tenant |
| GET /v1/admin/tenants | admin | list tenants |
| POST /v1/admin/keys | admin | mint a key — plaintext returned once |
| DELETE /v1/admin/keys/<id> | admin | revoke a key |
| GET /v1/admin/audit | admin | paginate the admin audit log |
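To make the scope column concrete, here is a small Python sketch of a per-route scope check built from the table above. It is illustrative only and not the server's implementation: `REQUIRED_SCOPE` and `is_allowed` are hypothetical names, and it assumes scopes are independent (that admin does not imply read or write), which the table does not state.

```python
# Route -> required scope, copied from the route table; None marks the rows with no scope.
REQUIRED_SCOPE = {
    ("GET", "/health"): None,
    ("GET", "/metrics"): None,
    ("GET", "/v1/whoami"): "read",
    ("POST", "/v1/memory/event"): "write",
    ("POST", "/v1/memory/search"): "read",
    ("POST", "/v1/memory/cypher"): "read",
    ("POST", "/v1/admin/tenants"): "admin",
    ("GET", "/v1/admin/tenants"): "admin",
    ("POST", "/v1/admin/keys"): "admin",
    ("GET", "/v1/admin/audit"): "admin",
}


def is_allowed(method: str, path: str, key_scopes: set) -> bool:
    """Return True if a key holding `key_scopes` may call the route.

    Assumption: scopes are checked by exact membership, with no hierarchy.
    """
    # DELETE /v1/admin/keys/<id> carries a path parameter, so match it by prefix.
    if method == "DELETE" and path.startswith("/v1/admin/keys/"):
        return "admin" in key_scopes
    if (method, path) not in REQUIRED_SCOPE:
        return False  # unknown route: deny by default
    required = REQUIRED_SCOPE[(method, path)]
    return required is None or required in key_scopes
```

A read-only key passes `is_allowed("POST", "/v1/memory/cypher", {"read"})` but fails the same check for `/v1/memory/event`, which needs write.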
SDKs ship with the server: async Rust, Python, and TypeScript clients built from the same core, plus Idempotency-Key support on writes.
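For callers not using an SDK, an idempotent write might look like the sketch below. It uses plain HTTP from the Python standard library rather than the SDK API. The Idempotency-Key header name comes from the text above; using a UUID as the key, and the server treating a repeated key as the same write, are assumptions. `build_idempotent_write` is a hypothetical helper.

```python
import json
import urllib.request
import uuid


def build_idempotent_write(base_url: str, token: str, event: dict, idempotency_key: str = None):
    """Build a POST /v1/memory/event request that carries an Idempotency-Key header.

    Returns the request and the key so a retry can reuse the same key.
    Assumption: any unique string is an acceptable key; a UUID is used here.
    """
    key = idempotency_key or str(uuid.uuid4())
    req = urllib.request.Request(
        base_url + "/v1/memory/event",
        data=json.dumps(event).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Idempotency-Key": key,
        },
        method="POST",
    )
    return req, key
```

On a timeout, the caller would rebuild the request with the key returned from the first attempt instead of generating a new one, so the retry is recognisable as the same write.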
Architecture
| Layer | Role |
|---|---|
| HTTP | axum + tower-http (tracing, CORS) |
| auth | Bearer token → tenant, scope check per handler |
| core | per-tenant engine pool: ingest · vector search · Cypher · provenance |
| ffs | graph + vector + columnar, one file per tenant |
memspine is deliberately the boring layer. FFS does the hard engine work — pages, WAL, MVCC, HNSW, atomic commits across graph and vector writes. memspine adds what a service needs and an engine shouldn't carry: tenancy, keys and scopes, rate limits, quotas, metrics, and an audit trail.
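The per-tenant engine pool in the core layer can be pictured with the short Python sketch below. The real server is written in Rust and its pool is not shown in this document, so everything here is a stand-in: `TenantEngine`, `EnginePool`, and the `.ffs` file extension are hypothetical, and no file is actually opened.

```python
import threading
from pathlib import Path


class TenantEngine:
    """Hypothetical stand-in for an open per-tenant FFS database handle."""

    def __init__(self, path: Path):
        self.path = path


class EnginePool:
    """Open one engine per tenant on first use and keep it cached for later requests."""

    def __init__(self, data_dir: Path):
        self._data_dir = data_dir
        self._engines = {}
        self._lock = threading.Lock()

    def get(self, tenant_id: str) -> TenantEngine:
        # The lock keeps two concurrent first requests from opening the same file twice.
        with self._lock:
            engine = self._engines.get(tenant_id)
            if engine is None:
                # First request for this tenant: bind it to its own dedicated file.
                engine = TenantEngine(self._data_dir / f"{tenant_id}.ffs")
                self._engines[tenant_id] = engine
            return engine
```

Because every tenant id resolves to a different file, two tenants never share a handle, which is the isolation property the architecture relies on.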
Operations
Every response carries an x-memspine-trace-id header, either caller-supplied or generated, and the same id is attached to every span the request produced, so a log aggregator can pivot from one HTTP call to everything it touched.
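A caller that wants to choose its own trace id could do something like the following Python sketch. It assumes the id is supplied in a request header with the same name as the response header, which the text implies but does not spell out; `with_trace_id` and `trace_id_of` are hypothetical helpers.

```python
import urllib.request
import uuid
from typing import Optional

TRACE_HEADER = "x-memspine-trace-id"  # assumption: same header name on request and response


def with_trace_id(req: urllib.request.Request, trace_id: Optional[str] = None) -> str:
    """Attach a trace id to an outgoing request and return the id that was used."""
    trace_id = trace_id or uuid.uuid4().hex
    req.add_header(TRACE_HEADER, trace_id)
    return trace_id


def trace_id_of(response_headers) -> Optional[str]:
    """Read the trace id from response headers (any mapping with .items())."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    for name, value in response_headers.items():
        if name.lower() == TRACE_HEADER:
            return value
    return None
```

Logging the value returned by `with_trace_id` next to the agent's own log lines gives a join key between client-side and server-side traces.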
/metrics exposes Prometheus series per tenant: write rate, query latency, storage growth, vector-index churn.
Per-tenant token buckets keep a runaway agent from taking down its neighbours or silently burning capacity.
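For readers unfamiliar with the technique, a per-tenant token bucket can be sketched in a few lines of Python. This is the textbook algorithm, not memspine's code; the class names, the rate and capacity parameters, and the injectable clock are all illustrative assumptions.

```python
import time


class TokenBucket:
    """Classic token bucket: refills at `rate` tokens per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity  # start full so a short burst is allowed
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


class TenantLimiter:
    """One independent bucket per tenant, created on first use."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        self._buckets = {}

    def allow(self, tenant_id: str) -> bool:
        bucket = self._buckets.get(tenant_id)
        if bucket is None:
            bucket = TokenBucket(self._rate, self._capacity, self._clock)
            self._buckets[tenant_id] = bucket
        return bucket.allow()
```

Since each tenant drains only its own bucket, a tenant that exhausts its tokens is refused while other tenants continue to be served.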
Admin actions append to an immutable log, paginated over the API.
FAQ
What is agent memory?
The durable record of everything an AI agent does and learns — tool calls, decisions, retrieval traces, derived facts. memspine stores those events as a typed graph with embeddings, so agents recall what happened, find similar past events, and trace where any fact came from.
How is memspine different from a vector database?
A vector database answers one question: what's similar. Agent memory also needs what happened (typed events in order) and where did this come from (provenance edges). memspine keeps all three in one engine instead of bolting a vector store onto a separate event log.
Does it work with any LLM or agent framework?
Yes — it's a plain HTTP API with Bearer-token auth. Anything that can make an HTTP request can append and search memory. SDKs ship for Rust, Python, and TypeScript.
Can I self-host it?
Yes. One command installs the prebuilt Rust binaries (memspine-server and memspine-admin) with no dependencies. Each tenant's data lives in its own FFS file on your disk.
How does multi-tenancy work?
Every tenant gets a dedicated database file, per-tenant rate limits, and scoped API keys (read / write / admin). No shared table, so one tenant's load never sits in another's query path.
Status
Early and honest about it. The HTTP surface, scopes, rate limits, admin plane, audit log, vector search, edge traversal, and the FFS storage path work end to end, gated by CI on every push. The v0 executor returns node ids and counts; richer Cypher projections arrive with the FFS columnar read path. Running an agent fleet and want managed memory under it? Write to sd@erp.ai.