Most vector databases treat memories as isolated points in high-dimensional space. They're good at finding similar vectors, but they miss something important: relationships.
Humans don't think in isolated facts. We think in webs of connected concepts. A memory about "React useEffect" is related to "JavaScript closures," which connects to "scope and hoisting," which links to "variable declaration patterns."
When we built takizen's Memory Graph, we wanted to capture this web of meaning. Here's how we did it.
The Data Model
At its core, the Memory Graph has two entities:
- Memories: The nodes. Each has content, an embedding vector, tags, and metadata like strength and recall count.
- Links: The edges. Typed relationships between memories with a semantic label.
The link types matter. We support:
- related_to — General association
- supports — One memory reinforces another
- contradicts — One memory opposes another
- part_of — Hierarchical composition
- caused_by — Causal relationship
These aren't just labels. They affect how the graph is traversed during recall. A "supports" link might strengthen a neighbor's relevance score, while a "contradicts" link might surface counterpoints.
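As a rough sketch of how typed links could modulate a neighbor's relevance during recall: the weight values below are invented for illustration, not takizen's actual numbers.

```typescript
type LinkType = "related_to" | "supports" | "contradicts" | "part_of" | "caused_by";

// Hypothetical per-type multipliers; the real traversal logic may differ.
const LINK_WEIGHTS: Record<LinkType, number> = {
  related_to: 1.0, // neutral association
  supports: 1.2, // reinforcing links boost the neighbor
  contradicts: 0.9, // counterpoints still surface, slightly dampened
  part_of: 1.1,
  caused_by: 1.1,
};

// Scale a neighbor's base relevance by the type of the link that reached it.
function neighborScore(baseScore: number, link: LinkType): number {
  return baseScore * LINK_WEIGHTS[link];
}
```

The point is that the edge label participates in scoring, so "supports" neighbors outrank plain associations at the same similarity.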
Auto-Linking
Manual linking is powerful but tedious. Most users won't spend time wiring up their knowledge graph.
So we built auto-linking. When you remember something new, takizen:
- Finds the most similar existing memories (similarity threshold 0.75)
- Creates related_to links to the top 5 matches
- Does this best-effort — if linking fails, your remember still succeeds
The result is that your graph grows organically. Every new memory automatically connects to relevant existing context.
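The steps above can be sketched as follows. The helper signatures (findSimilar, createLink) and the Match shape are illustrative, not takizen's actual internals.

```typescript
interface Match {
  id: string;
  similarity: number;
}

const SIMILARITY_THRESHOLD = 0.75;
const MAX_AUTO_LINKS = 5;

// Keep only matches above the threshold, best first, capped at five.
function pickAutoLinkTargets(matches: Match[]): Match[] {
  return matches
    .filter((m) => m.similarity >= SIMILARITY_THRESHOLD)
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, MAX_AUTO_LINKS);
}

// Best-effort wrapper: a failure here must not fail the remember itself.
async function autoLink(
  newMemoryId: string,
  findSimilar: (id: string) => Promise<Match[]>,
  createLink: (from: string, to: string, type: string) => Promise<void>,
): Promise<void> {
  try {
    const targets = pickAutoLinkTargets(await findSimilar(newMemoryId));
    await Promise.all(
      targets.map((t) => createLink(newMemoryId, t.id, "related_to")),
    );
  } catch {
    // Swallow errors: auto-linking is optional, the memory is already saved.
  }
}
```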
Hybrid Search: RRF
Semantic search finds memories similar to your query. Graph search finds memories connected to relevant nodes. Both are useful. We wanted both.
Enter Reciprocal Rank Fusion (RRF). It's a technique for combining ranked lists that doesn't require score normalization. Here's the formula:
RRF_score = Σ 1 / (k + rank_i)
For each memory, we sum the reciprocal-rank contribution 1 / (k + rank) from two rankings: semantic similarity and graph proximity. The k constant (we use 60) dampens the impact of absolute rank differences.
The result: memories that are both semantically similar AND well-connected in the graph bubble to the top.
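A minimal RRF fusion over two ranked lists might look like this; it is a sketch of the standard technique, not takizen's exact code.

```typescript
// Reciprocal Rank Fusion over two ranked lists of memory ids.
// Ranks are 1-based; k dampens absolute rank differences (we use 60 above).
function rrfFuse(semantic: string[], graph: string[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of [semantic, graph]) {
    list.forEach((id, i) => {
      // rank = i + 1, so each appearance contributes 1 / (k + rank).
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

Note that a memory appearing in both lists accumulates two contributions, which is exactly why well-connected, semantically similar memories rise to the top.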
The Implementation
Technically, the Memory Graph lives in Supabase PostgreSQL with pgvector. Key details:
- Embeddings: 1536-dim vectors from OpenAI's text-embedding-3-small via OpenRouter
- Index: HNSW (Hierarchical Navigable Small World) for fast approximate nearest neighbor search
- Links: Simple join table with from_id, to_id, type, and namespace isolation
- Traversal: Recursive CTEs for graph expansion within a namespace
The graph operations are wrapped in RPC functions for clean API boundaries:
- match_memories — Pure semantic search
- get_memory_neighbors — Graph expansion from a seed set
- match_memories_hybrid — RRF fusion of both
Performance
Graph operations can be expensive. We've optimized for our use case:
- Namespace isolation means we never traverse the entire graph — just one user's subgraph
- We limit graph expansion to 2 hops from seed memories
- Neighbor retrieval is batched and capped at 20 results
- The HNSW index handles the heavy lifting of similarity search
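The bounded traversal described above can be sketched as a breadth-first expansion from the seed set, limited to 2 hops and capped at 20 neighbors. The real implementation uses a recursive CTE in PostgreSQL; getNeighbors here is a hypothetical adjacency lookup already scoped to a single namespace.

```typescript
const MAX_HOPS = 2;
const MAX_NEIGHBORS = 20;

// Breadth-first expansion: visit each memory once, stop after MAX_HOPS
// levels or once MAX_NEIGHBORS results have been collected.
function expandGraph(
  seeds: string[],
  getNeighbors: (id: string) => string[],
): string[] {
  const visited = new Set(seeds);
  const result: string[] = [];
  let frontier = seeds;
  for (let hop = 0; hop < MAX_HOPS && result.length < MAX_NEIGHBORS; hop++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const neighbor of getNeighbors(id)) {
        if (visited.has(neighbor) || result.length >= MAX_NEIGHBORS) continue;
        visited.add(neighbor);
        result.push(neighbor);
        next.push(neighbor);
      }
    }
    frontier = next;
  }
  return result;
}
```

The hop and size caps are what keep traversal cost predictable regardless of how dense a user's subgraph becomes.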
In practice, hybrid recall adds ~20-30ms to query time. Worth it for the quality improvement.
What's Next
The Memory Graph is still young. We're exploring:
- Weighted links — Let the graph learn which connections matter most
- Temporal edges — "Before/after" relationships for procedural memories
- Community detection — Automatically cluster related memories into topics
The goal is the same: make AI memory more like human memory. Connected, contextual, and alive.