When we talk about AI memory, we often focus on storage and retrieval. Can the AI remember what I told it yesterday? Can it find the right context when I ask a question? These are important problems, but they're only half the story.
The other half is learning. Not just remembering facts, but learning which facts matter in which contexts. Learning that some combinations of memories lead to better conclusions than others.
That's why we built CoT Feedback.
What is Chain of Thought Feedback?
Chain of Thought (CoT) is a technique where AI agents show their work—they explain their reasoning step by step before giving a final answer. It's powerful because it makes the AI's thinking process visible and verifiable.
But there's a gap. When an AI uses multiple memories to solve a problem, how does it know which memories helped and which didn't? Without feedback, every recall is equally weighted, regardless of whether the resulting reasoning was sound.
CoT Feedback closes this loop. After a Chain of Thought session, the AI (or its operator) can provide feedback on how effective the retrieved memories were:
- Which memories were used in the reasoning chain
- How effective they were (0-1 score)
- Why they worked or didn't work (optional notes)
Over time, this creates a learning signal. Memories that consistently contribute to successful reasoning get boosted in future recalls. Memories that lead to errors get deprioritized.
How It Works
The implementation has three parts:
1. Session Tracking
When an AI starts a reasoning task, we create a cot_session record with the context query. Every memory retrieved during that session is linked to it via cot_session_memories.
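The record and table names above come straight from the post; everything else in this in-memory sketch (function names, ID format, field names) is an assumption made for illustration:

```javascript
// Minimal in-memory sketch of session tracking. The names cot_session and
// cot_session_memories mirror the post; the store itself is hypothetical.
const cotSessions = new Map();
const cotSessionMemories = []; // link rows: { sessionId, memoryId }

function startCotSession(contextQuery) {
  // Create the cot_session record with the context query
  const sessionId = `cot_${cotSessions.size + 1}`;
  cotSessions.set(sessionId, { contextQuery, startedAt: Date.now() });
  return sessionId;
}

function linkRetrievedMemory(sessionId, memoryId) {
  // Every memory retrieved during the session gets linked to it
  cotSessionMemories.push({ sessionId, memoryId });
}
```

The link table is what later lets feedback be attributed to the exact set of memories used in a given reasoning chain.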
2. Effectiveness Scoring
After the reasoning completes, feedback is submitted via the cot_feedback tool. The score (0-1) represents how effective the memory combination was for this specific context.
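The tool name cot_feedback and the 0-1 score come from the post; the payload field names and the helper below are assumptions sketched for illustration:

```javascript
// Hypothetical helper that assembles a cot_feedback payload and enforces
// the 0-1 score range described in the post.
function buildCotFeedback(sessionId, memoryIds, score, notes) {
  if (score < 0 || score > 1) {
    throw new RangeError("score must be in [0, 1]");
  }
  return { session_id: sessionId, memory_ids: memoryIds, score, notes };
}

// Example: the reasoning went well, and one memory was the deciding factor
const feedback = buildCotFeedback(
  "cot_123", // hypothetical session ID
  ["mem_42", "mem_7"],
  0.85,
  "Index-tuning memory was decisive; caching memory was tangential."
);
// The payload would then be submitted via the cot_feedback tool.
```

The optional notes field is worth filling in: the score drives ranking, but the notes are what make the feedback auditable later.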
3. Recall Boosting
When the AI recalls memories with boost_by_cot: true, we combine semantic similarity with the historical effectiveness scores. The result is a boosted score that prioritizes memories with proven track records.
```javascript
// Without CoT boost
await recall({ query: "How to optimize this database?" })
// Returns: memories ranked by pure vector similarity

// With CoT boost
await recall({
  query: "How to optimize this database?",
  boost_by_cot: true,
  cot_boost_weight: 0.3 // 30% CoT score, 70% similarity
})
// Returns: memories ranked by combined effectiveness
```
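The cot_boost_weight comment above implies a weighted split between the two signals. A simple linear blend is one way to realize that split; the post specifies the weight, not the exact formula, so treat this as a sketch:

```javascript
// Sketch of a boosted ranking score, assuming a linear blend of semantic
// similarity and historical CoT effectiveness (formula is an assumption).
function boostedScore(similarity, cotEffectiveness, cotBoostWeight = 0.3) {
  return (1 - cotBoostWeight) * similarity + cotBoostWeight * cotEffectiveness;
}

// A memory with a proven track record can outrank a slightly more
// similar memory that has never helped before:
boostedScore(0.80, 0.90, 0.3); // ≈ 0.83
boostedScore(0.82, 0.20, 0.3); // ≈ 0.63
```

With the weight at 0.3, similarity still dominates; raising the weight shifts rankings further toward memories with strong feedback histories.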
The Decay Factor
Effectiveness isn't static. A memory that was helpful six months ago might not be relevant today. That's why CoT feedback has its own decay mechanism (0.02/day, slower than memory decay's 0.05/day).
This means recent feedback has more weight than old feedback. The system adapts to changing contexts without forgetting hard-won lessons.
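The post gives the decay rate (0.02/day) but not the curve; assuming exponential decay, a common choice for this kind of weighting, the effect looks like this:

```javascript
// Sketch of feedback-weight decay, assuming an exponential curve at the
// stated rate (the rate is from the post; the curve is an assumption).
const COT_DECAY_RATE = 0.02; // per day; memory decay uses 0.05/day

function decayedWeight(score, ageDays, rate = COT_DECAY_RATE) {
  return score * Math.exp(-rate * ageDays);
}

// Day-old feedback keeps ~98% of its weight; six-month-old feedback
// keeps under 3%, so it fades without ever being hard-deleted.
decayedWeight(1.0, 1);   // ≈ 0.98
decayedWeight(1.0, 180); // ≈ 0.027
```

The slower rate relative to memory decay means a memory's feedback history outlives its raw recency signal, which is what lets "hard-won lessons" persist.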
Real-World Impact
We're already seeing CoT Feedback make a difference in how agents use takizen:
- Support agents learn which troubleshooting memories lead to resolved tickets
- Code assistants discover which pattern memories produce working code
- Research agents identify which source combinations yield accurate summaries
The key insight is that context matters. A memory about React hooks might be crucial for a frontend task and irrelevant to a database optimization task. CoT Feedback learns these distinctions from experience.
Getting Started
CoT Feedback is available now for all takizen users. To use it:
- Enable boost_by_cot in your recall calls
- After Chain of Thought reasoning, submit feedback via cot_feedback
- Adjust cot_boost_weight based on your use case (higher for established domains, lower for novel problems)
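Put together, a full loop looks roughly like the sketch below. The recall and cot_feedback calls stand in for the takizen tools; here they are stubbed so the flow is self-contained, and the session ID and payload fields are assumptions:

```javascript
// Stubs standing in for the takizen tools, so the flow below runs as-is.
async function recall({ query, boost_by_cot = false, cot_boost_weight = 0.3 }) {
  return [{ id: "mem_42", text: "Add an index on the slow column" }];
}
async function cot_feedback(payload) { /* submit feedback */ }

async function troubleshoot() {
  // 1. Recall with boosting enabled
  const memories = await recall({
    query: "How to optimize this database?",
    boost_by_cot: true,
    cot_boost_weight: 0.3,
  });

  // 2. ...Chain of Thought reasoning using `memories`...

  // 3. Close the loop with feedback on how the memories performed
  await cot_feedback({
    session_id: "cot_123", // hypothetical session ID
    memory_ids: memories.map((m) => m.id),
    score: 0.9,
    notes: "Index suggestion fixed the slow query.",
  });
  return memories;
}
```

The important part is step 3: without it, the boost in step 1 has no signal to work from.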
We're excited to see how the community uses this. AI memory isn't just about storage anymore—it's about learning what works.