feat: memory store and agent long-term memory #1266
Yeah, there is an LLM-based extraction step here. However, there isn't a "consolidation" step: before inserting a memory directly after extraction, you find relevant entries in the DB, show them to another LLM, and ask it to update or add to those entries. This is commonly done in, e.g., Vertex AI Memory Bank. I marked that in the limitations section, but it will be more complicated to do.
With this PR, if you enable memory on declarative agents, you only use the kagent memory store; for those two options you can use BYO for now. I'm working on refactoring the interface so it can be extended to other memory stores, and I'll also think about whether we can allow things like the Vertex memory service.
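The consolidation step described above could look roughly like the sketch below. It is an illustration only, not code from this PR: `MemoryEntry`, `consolidate`, and the `decide` callback (a stand-in for the second LLM call) are hypothetical names, and relevance is approximated with cosine similarity over the stored vectors.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class MemoryEntry:
    text: str
    vector: List[float]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def consolidate(
    store: List[MemoryEntry],
    candidate: MemoryEntry,
    decide: Callable[[str, List[str]], Optional[int]],
    top_k: int = 3,
) -> None:
    """Insert `candidate`, or overwrite an existing entry if `decide` picks one.

    `decide` stands in for the second LLM: given the new text and the texts of
    the most similar stored entries, it returns the index (within that list)
    of the entry to update, or None to add a new entry.
    """
    ranked = sorted(store, key=lambda e: cosine(e.vector, candidate.vector), reverse=True)
    related = ranked[:top_k]
    choice = decide(candidate.text, [e.text for e in related])
    if choice is None:
        store.append(candidate)
    else:
        related[choice].text = candidate.text
        related[choice].vector = candidate.vector
```

In a real implementation `decide` would prompt a model with the related entries; keeping it as a callback makes the flow testable without one.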
0da4998 to 2bf6a2f
Pull request overview
Implements the initial “memory store” feature end-to-end (CRD/translator → Python runtime memory service/tools → Go controller API + DB/vector support → UI surfacing/management), aligning with Issue #1256 and the linked design doc.
Changes:
- Adds Go controller `/api/memories/*` endpoints plus DB models/client operations for vector-backed memory storage, search, listing, deletion, and TTL pruning (pgvector for Postgres; Turso/libSQL vectors for SQLite).
- Extends the Python ADK runtime with a `KagentMemoryService` plus memory tools (save/load/prefetch) and auto-save callback wiring via `AgentConfig`.
- Adds UI types, server actions, and dialogs/menus to view and clear an agent’s stored memories.
Reviewed changes
Copilot reviewed 37 out of 39 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| ui/src/types/index.ts | Adds AgentMemory type used by the new memory UI. |
| ui/src/components/MemoriesDialog.tsx | New dialog to list and clear an agent’s memories. |
| ui/src/components/DeleteAgentButton.tsx | Supports externally-controlled delete dialog (for dropdown menu usage). |
| ui/src/components/AgentCard.tsx | Replaces inline buttons with an options dropdown; wires delete + memories dialogs. |
| ui/src/app/actions/memories.ts | Adds server actions to list/clear memories via backend API. |
| python/uv.lock | Updates Python dependency lockfile (incl. numpy + telemetry-related shifts). |
| python/packages/kagent-adk/src/kagent/adk/types.py | Adds memory-related config (memory_enabled, embedding) and injects memory tools/callbacks. |
| python/packages/kagent-adk/src/kagent/adk/tools/prefetch_memory_tool.py | Adds prefetch tool to inject relevant past context on the first user message. |
| python/packages/kagent-adk/src/kagent/adk/tools/memory_tools.py | Adds explicit save/load memory tools for agents. |
| python/packages/kagent-adk/src/kagent/adk/cli.py | Passes agent_config into app construction so runtime can enable memory. |
| python/packages/kagent-adk/src/kagent/adk/_memory_service.py | Implements KagentMemoryService (embedding, summarization, store/search via Go API). |
| python/packages/kagent-adk/src/kagent/adk/_a2a.py | Wires KagentMemoryService into ADK Runner when enabled. |
| python/packages/kagent-adk/pyproject.toml | Adds numpy dependency for embedding normalization. |
| helm/kagent/values.yaml | Adds vector enablement flags for sqlite/postgres backends. |
| helm/kagent/templates/controller-configmap.yaml | Exposes DATABASE_VECTOR_ENABLED to controller via env var. |
| helm/kagent-crds/templates/kagent.dev_agents.yaml | Extends Agent CRD schema with spec.declarative.memory. |
| go/pkg/database/models.go | Adds Memory model (+ search result) and updates several existing models. |
| go/pkg/database/client.go | Extends DB client interface with agent-memory methods. |
| go/pkg/app/app.go | Plumbs VectorEnabled into DB configuration and exposes CLI flag. |
| go/pkg/adk/types.go | Adds EmbeddingConfig + memory_enabled/embedding in serialized AgentConfig. |
| go/internal/httpserver/server.go | Starts periodic background TTL pruning task and updates memory routes. |
| go/internal/httpserver/handlers/memory_test.go | Replaces prior K8s-CRD memory tests with new API-based memory endpoint tests. |
| go/internal/httpserver/handlers/memory.go | Implements new memory endpoints: add (single/batch), search, list, delete. |
| go/internal/database/manager.go | Adds pgvector extension/index creation and Turso/libSQL vector table creation for sqlite. |
| go/internal/database/fake/client.go | Adds stubbed agent-memory methods; adjusts feedback storage behavior. |
| go/internal/database/client_test.go | Adds sqlite vector-backed tests for store/search/batch/delete/prune flows. |
| go/internal/database/client.go | Implements store/search/list/delete/prune for agent memory across sqlite+postgres. |
| go/internal/controller/translator/agent/testdata/outputs/agent_with_memory.json | Adds golden output for agent translation with memory enabled. |
| go/internal/controller/translator/agent/testdata/inputs/agent_with_memory.yaml | Adds golden input for agent translation with memory config + embedding ModelConfig. |
| go/internal/controller/translator/agent/testdata/README.md | Updates memory testdata documentation to match new long-term memory design. |
| go/internal/controller/translator/agent/adk_api_translator.go | Translates CRD memory config into ADK AgentConfig embedding settings + secret/env merging. |
| go/go.mod | Adds pgvector-go and Turso driver dependency; updates indirect deps. |
| go/go.sum | Updates module checksums for added/shifted Go dependencies. |
| go/api/v1alpha2/agent_types.go | Adds MemorySpec to the Agent API type. |
| go/api/v1alpha2/zz_generated.deepcopy.go | Adds deepcopy support for the new MemorySpec field. |
| go/config/crd/bases/kagent.dev_agents.yaml | Updates generated CRD bases with new memory schema. |
| go/Dockerfile | Switches to distroless cc-debian12 base to support Turso/purego runtime needs. |
| design/EP-1256-memory.md | Adds detailed design doc for the memory store feature. |
| contrib/addons/postgres.yaml | Switches addon image to pgvector-enabled Postgres image for local/dev. |
EItanya left a comment
PR Review: feat: memory store and agent long-term memory
This PR adds long-term memory to kagent agents using vector similarity search (pgvector for Postgres, libSQL vector for SQLite). It spans 39 files across Go, Python, UI, and Helm. Here's my review:
High Severity
1. Missing self parameter in _normalize_l2 — python/packages/kagent-adk/src/kagent/adk/_memory_service.py
The method is defined as a regular method but is missing self:
def _normalize_l2(x): # should be: def _normalize_l2(self, x)
It's called as self._normalize_l2(embedding), so this will crash at runtime when an embedding exceeds 768 dimensions and needs normalization.
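The failure mode in item 1 can be reproduced in isolation. The sketch below is not the PR's code: the `Broken` and `Fixed` classes and the `embed` method are hypothetical, and the normalization uses only the standard library rather than numpy.

```python
import math
from typing import List


class Broken:
    def _normalize_l2(x):  # bug: `self` is missing
        norm = math.sqrt(sum(v * v for v in x))
        return [v / norm for v in x] if norm else list(x)

    def embed(self, embedding: List[float]) -> List[float]:
        # Python passes the instance implicitly, so the call below supplies
        # two positional arguments to a function that accepts one.
        return self._normalize_l2(embedding)


class Fixed:
    def _normalize_l2(self, x: List[float]) -> List[float]:
        norm = math.sqrt(sum(v * v for v in x))
        return [v / norm for v in x] if norm else list(x)

    def embed(self, embedding: List[float]) -> List[float]:
        return self._normalize_l2(embedding)
```

Calling `Broken().embed(...)` raises `TypeError`, while `Fixed().embed([3.0, 4.0])` returns a unit-length vector. Decorating the helper with `@staticmethod` would be an equally valid fix, since it does not use instance state.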
2. Incorrect vector type check — _memory_service.py:89-91
if isinstance(vectors[0], float):
    vectors = [vectors]
If vectors[0] is a float, the entire vectors list is a flat list of floats (a single vector), and wrapping it produces [[0.1, 0.2, ...]], which is correct. But the check should be not isinstance(vectors[0], (list, np.ndarray)) to also handle numpy arrays properly. As written, an element of a numpy array (for example np.float32, which is not a subclass of float) would fail the check, so a flat numpy vector would not be wrapped and would cause downstream failures.
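One way to make the shape check in item 2 independent of the concrete scalar type is to test for "is a number" rather than "is a float". The sketch below is an alternative to the check suggested above, not the PR's code. `as_batch` is a hypothetical helper, and it relies on `numbers.Real` from the standard library. NumPy is not imported here, so its behavior is an assumption to verify: as far as I know it registers its floating scalar types with `numbers.Real`.

```python
from numbers import Real
from typing import List, Sequence, Union

Vector = Sequence[float]


def as_batch(vectors: Union[Vector, Sequence[Vector]]) -> List[List[float]]:
    """Return a list of vectors, wrapping a single flat vector if needed."""
    if len(vectors) == 0:
        return []
    first = vectors[0]
    # A scalar first element means `vectors` is one flat vector.
    # numbers.Real is used instead of float so ints also count as scalars.
    if isinstance(first, Real):
        return [[float(v) for v in vectors]]
    return [[float(v) for v in vec] for vec in vectors]
```

Converting every element with `float()` also means the result is plain Python lists, which serialize to JSON without extra handling.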
3. Fake client stubs make handler tests unreliable — go/internal/database/fake/client.go
All 6 agent memory methods are no-op stubs returning empty results. The handler tests in memory_test.go acknowledge this ("Fake client returns empty results, which is fine for this test"), but it means the test suite provides no coverage of actual memory storage or retrieval behavior. Other methods in the fake client (e.g. CrewAI) are fully implemented — these should be too.
Medium Severity
4. No batch size limit — go/internal/httpserver/handlers/memory.go:113-158
The AddSessionBatch endpoint accepts an unbounded number of items. A caller can send millions of memories in one request with no limit. Add a MAX_BATCH_SIZE check.
5. No vector dimension validation — handlers/memory.go:73-76, 168-171
The handlers check len(req.Vector) == 0 but never validate that the vector is 768-dimensional. Storing a vector of wrong dimensionality into a F32_BLOB(768) column will cause subtle bugs or silent data corruption.
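The checks requested in items 4 and 5 amount to the logic below. The real handlers are written in Go; this is a language-neutral sketch in Python, and `validate_batch` is a hypothetical name. The limit of 100 is an arbitrary assumption, since the review does not propose a value.

```python
from typing import List, Sequence

MAX_BATCH_SIZE = 100  # assumed limit; the review does not name a value
EMBEDDING_DIM = 768   # matches the F32_BLOB(768) column mentioned above


def validate_batch(vectors: Sequence[Sequence[float]]) -> List[str]:
    """Return a list of validation errors; an empty list means the batch is OK."""
    errors: List[str] = []
    if len(vectors) == 0:
        errors.append("batch is empty")
    if len(vectors) > MAX_BATCH_SIZE:
        errors.append(f"batch has {len(vectors)} items, limit is {MAX_BATCH_SIZE}")
    for i, vec in enumerate(vectors):
        if len(vec) != EMBEDDING_DIM:
            errors.append(f"item {i}: expected {EMBEDDING_DIM} dimensions, got {len(vec)}")
    return errors
```

Rejecting the request up front with a 400-style error is cheaper than discovering a dimension mismatch at insert time.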
6. Memory table not cleaned up in Reset() — go/internal/database/manager.go
The memory table is manually created in Initialize() (not via AutoMigrate), but Reset() never drops it. This leaves orphaned data when the database is reset.
7. Auto-save only triggers every 5th user message — python/packages/kagent-adk/src/kagent/adk/types.py:460
if user_msg_count > 0 and user_msg_count % 5 == 0:
Short conversations (< 5 turns) will never have memories saved automatically. The threshold should be configurable or the strategy reconsidered (e.g., save on session end instead of on a modulus count).
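To make the gap in item 7 concrete, here is a small sketch of the quoted condition with the interval turned into a parameter and an optional flush at session end. `should_autosave`, `every_n`, and `session_ended` are hypothetical names, and how the runtime would detect the end of a session is not addressed here.

```python
def should_autosave(user_msg_count: int, every_n: int = 5, session_ended: bool = False) -> bool:
    """Decide whether to trigger an automatic memory save.

    Saves on every `every_n`-th user message, and additionally when the
    session ends with at least one user message that has not been saved yet.
    """
    if every_n <= 0:
        raise ValueError("every_n must be positive")
    if user_msg_count <= 0:
        return False
    if user_msg_count % every_n == 0:
        return True
    return session_ended
```

With the defaults, message counts 1 through 4 never trigger a save, which is the short-conversation gap described above; passing `session_ended=True` closes it.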
8. No validation that embedding config exists when memory_enabled=True — python/packages/kagent-adk/src/kagent/adk/types.py:238-240
If memory_enabled=True but embedding is None, the memory service will warn at runtime and silently fail. This should be validated at config load time (e.g. via a Pydantic validator) so it fails fast.
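A fail-fast check for item 8 could look like the sketch below. To stay self-contained it uses standard-library dataclasses as a stand-in for the Pydantic models in types.py; in Pydantic v2 the same check would live in a method decorated with `@model_validator(mode="after")`. The two classes are simplified stand-ins with assumed fields, not the PR's definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EmbeddingConfig:
    # Simplified stand-in; the real config likely carries more fields.
    provider: str
    model: str


@dataclass
class AgentConfig:
    memory_enabled: bool = False
    embedding: Optional[EmbeddingConfig] = None

    def __post_init__(self) -> None:
        # Reject the inconsistent combination when the config is loaded,
        # instead of warning later when the memory service is first used.
        if self.memory_enabled and self.embedding is None:
            raise ValueError("memory_enabled=True requires an embedding config")
```

Constructing `AgentConfig(memory_enabled=True)` now raises immediately, so a misconfigured agent fails at startup rather than silently running without memory.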
9. Inconsistent API access pattern in memory tools — python/packages/kagent-adk/src/kagent/adk/tools/memory_tools.py
SaveMemoryTool directly accesses tool_context._invocation_context.memory_service (protected member), while LoadMemoryTool uses the public tool_context.search_memory(). These should use the same access pattern.
10. Broad except Exception: pass blocks — _memory_service.py:268-270, 278-279
Multiple locations silently swallow all exceptions without logging. This makes debugging production issues extremely difficult. At minimum, log a warning.
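For item 10, a minimal alternative to `except Exception: pass` is to keep the best-effort behavior but log the failure. The helper name `try_or_none` and the logger name are hypothetical; only the standard `logging` module is used.

```python
import logging
from typing import Callable, Optional, TypeVar

logger = logging.getLogger("kagent.memory")  # hypothetical logger name

T = TypeVar("T")


def try_or_none(op: Callable[[], T], what: str) -> Optional[T]:
    """Run `op`; on failure log a warning with the traceback and return None."""
    try:
        return op()
    except Exception:
        # exc_info=True attaches the traceback to the log record.
        logger.warning("memory operation failed: %s", what, exc_info=True)
        return None
```

The caller still gets `None` on failure, so control flow is unchanged, but the exception is no longer invisible in production logs.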
Low Severity
11. TTL hardcoded to 15 days — handlers/memory.go:79, 126
Not configurable via the MemorySpec CRD or any API. Consider adding a ttl field to MemorySpec.
12. fmt.Printf() used instead of structured logger — go/internal/database/client.go:670
The rest of the codebase uses log.Info/log.Error. This should be consistent.
13. Commented-out debug logging — handlers/memory.go:63, 71
Remove commented-out debug code before merging.
14. Deprecated _derive_search_query method still present — prefetch_memory_tool.py:87-117
Marked # TODO: DEPRECATED but still fully implemented. Remove it or explain why it's retained.
15. UUID default gen_random_uuid() is Postgres-specific — go/pkg/database/models.go:209
The GORM tag default:gen_random_uuid() only works on Postgres. SQLite will need app-level UUID generation or a compatible function from Turso.
16. Missing composite index on (agent_name, user_id) — go/pkg/database/models.go
Most queries filter by both agent_name and user_id. A composite index would improve query performance.
17. Hardcoded 768 embedding dimensions — _memory_service.py:338
response = await aembedding(model=litellm_model, input=texts, dimensions=768)
Not all models support arbitrary dimension truncation. Consider making this configurable or documenting the assumption.
18. Loose LiteLLM version constraint — pyproject.toml
litellm>=1.74.3 is specified without an upper bound. LiteLLM has frequent API changes; consider adding a <2.0 cap.
Design / Architecture Observations
CRD field naming: The BYOAgentSpec has memory as a string array (line 2210 in the CRD yaml), while DeclarativeAgentSpec has memory as an object with enabled/modelConfig (line 9866). These are on different specs so it's not a conflict per se, but the naming reuse is confusing. Verify this is intentional.
No Python tests: There are no unit tests for _memory_service.py, memory_tools.py, or prefetch_memory_tool.py. Given the complexity of embedding generation, summarization, and vector search, this needs test coverage.
Incomplete Go handler test coverage: No tests for the batch, list, or delete endpoints. Only the single-store and search endpoints have tests.
Summary
| Severity | Count | Key Items |
|---|---|---|
| High | 3 | Missing self param (runtime crash), vector type check bug, fake client stubs |
| Medium | 7 | No batch limit, no dimension validation, Reset() leak, auto-save threshold, config validation, API inconsistency, swallowed exceptions |
| Low | 8 | Hardcoded TTL, printf logging, dead code, UUID default, missing index, etc. |
Recommendation: The high-severity items (especially #1 and #2) are runtime-crashing bugs that must be fixed before merge. The medium items around input validation (#4, #5) and test coverage should also be addressed. The design and architecture are sound overall — the vector similarity search approach is well-thought-out, and the design doc is thorough.
go/api/v1alpha2/agent_types.go
Outdated
// The name of the model config to use for embeddings.
// If not specified, it defaults to the agent's main ModelConfig.
// +optional
ModelConfig string `json:"modelConfig,omitempty"`
Please use *corev1.LocalObjectReference
> Please use *corev1.LocalObjectReference
Why shouldn't we use string, since the upper-level ModelConfig uses that?
// Merge EnvVars from embedding config (e.g. API Keys)
mdd.EnvVars = append(mdd.EnvVars, embMdd.EnvVars...)
mdd.Volumes = append(mdd.Volumes, embMdd.Volumes...)
mdd.VolumeMounts = append(mdd.VolumeMounts, embMdd.VolumeMounts...)
What about duplicates? I believe they may make the Deployment invalid, e.g. if OPENAI_API_KEY is used by both the LLM and the embedding model.
Added deduplication of env vars and volumes when translating the embedding config, so the user can use a different provider for embeddings than for the LLM without producing duplicate entries.
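The deduplication described here is done in the Go translator and is not shown in this thread. As a language-neutral illustration of the idea, the Python sketch below merges two lists of named entries and keeps the first occurrence of each name. `merge_by_name` is a hypothetical helper, and "first entry wins" is an assumption about the policy, not a statement of what the PR does.

```python
from typing import Dict, Iterable, List


def merge_by_name(
    base: Iterable[Dict[str, str]], extra: Iterable[Dict[str, str]]
) -> List[Dict[str, str]]:
    """Append entries from `extra` whose "name" is not already present in `base`."""
    merged = list(base)
    seen = {item["name"] for item in merged}
    for item in extra:
        if item["name"] not in seen:
            merged.append(item)
            seen.add(item["name"])
    return merged
```

One case the policy has to settle is the same name with different values (for example two different API keys under one variable name); with first-wins, the LLM's value silently takes precedence.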
# Use distroless/cc-debian12 which includes C/C++ runtime libraries
# This is required for turso-go's purego library loading
# Refer to https://github.com/GoogleContainerTools/distroless for more details
- FROM gcr.io/distroless/static:nonroot
+ FROM gcr.io/distroless/cc-debian12:nonroot
This is a pretty big change. Why do we need this if we don't need CGO?
I believe we do; otherwise I ran into an error from Turso.
860ddf1 to dc99d23
Fixed most of the comments above 😄. I will be adding an e2e test for memory together with this one.
Signed-off-by: Jet Chiang <[email protected]>
d97a5ae to 28ac76a
Short demo (really simple example): Screen.Recording.2026-02-25.at.3.18.27.PM.mov
This does not show auto memory extraction, auto prefetching memory, or auto memory cleanup.
// EmbeddingConfig is the embedding model config for memory tools.
// JSON uses "provider" to match Python EmbeddingConfig; unmarshaling accepts "type" for backward compat.
type EmbeddingConfig struct {
Please add more unit tests for this in a follow-up. The custom marshalling in this function is starting to get pretty complex.

Initial design: #1256
Detailed design doc: design/EP-1256-memory.md